Skip to content

Conversation

@sei-vsarvepalli
Copy link
Contributor

This is for fixing the problem in Issu #1060 where CSV files with commas and quote delimited commas could not be uploaded to the calculator. some small security fix to where one can use a malicious filename to inject HTML - not necessarily XSS though.

This pull request refactors and modernizes the CSV/TSV file import logic in ssvc.js, replacing the old parse_file function with a new implementation that builds a more structured and extensible schema for imported decision trees. It also updates how imported files are added to the UI and ensures imported trees are tracked and selectable.

Major improvements to file import and schema handling:

  • Replaces the old parse_file function with a new version that:
    • Detects file version (CSVv1/CSVv2), generates unique short keys for headers and values, and constructs a structured schema for decision points and mappings, improving extensibility and maintainability.
    • Automatically adds imported decision trees to the global SSVC.decision_trees array and updates the UI dropdown, ensuring users can select the newly imported tree.
    • Updates the select_add_option function to accept an explicit display text for dropdown options, supporting better labeling of imported files.

Integration and function usage updates:

  • Updates the readFile function to pass the filename to parse_file, allowing the new schema to include file metadata and ensuring correct dropdown labeling.

These changes make the import process more robust, user-friendly, and ready for future enhancements.

@sei-vsarvepalli sei-vsarvepalli self-assigned this Jan 22, 2026
@sei-vsarvepalli sei-vsarvepalli added the bug Something isn't working label Jan 22, 2026
Copilot AI review requested due to automatic review settings January 22, 2026 23:13
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the SSVC calculator’s CSV/TSV import flow to address Issue #1060 by refactoring the legacy CSV parser into a new “imported decision tree schema” builder and improving how imported trees are added/selected in the UI.

Changes:

  • Updates select_add_option to support explicit display text and use .text() (safer than .html() for user-controlled strings).
  • Refactors CSV/TSV import by replacing the old parse_file logic with a new implementation that builds a structured schema and routes it through parse_json.
  • Updates readFile to pass the uploaded filename into parse_file for better labeling/metadata.

Comment on lines +1445 to +1449
let key = str[0];
let j = 2;
while (used[key]) {
key = str.substr(0, j);
j++;
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uniquePrefix can enter an infinite loop when str is empty or when two headers are identical/short enough that str.substr(0, j) never changes (e.g., duplicate 1-character headers). Add guards for empty strings and ensure the generated key changes (e.g., fall back to appending a numeric suffix once j exceeds str.length).

Suggested change
let key = str[0];
let j = 2;
while (used[key]) {
key = str.substr(0, j);
j++;
// Guard against empty or non-string inputs to avoid infinite loops
if (typeof str !== 'string' || str.length === 0) {
let base = 'k';
let suffix = 1;
let key = base;
while (used[key]) {
key = base + suffix;
suffix += 1;
}
used[key] = str;
return key;
}
let key = str[0];
let j = 2;
let suffix = 1;
while (used[key]) {
if (j <= str.length) {
// Try increasingly long prefixes of the string
key = str.substr(0, j);
j += 1;
} else {
// All prefixes are taken; fall back to appending a numeric suffix
key = str + '_' + suffix;
suffix += 1;
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed as the empty string is not possible for this variable.

Comment on lines +1477 to +1481
if (x.indexOf('","') > -1 && x[0] === '"' && x.at(-1) === '"') {
x = x.substr(1, x.length - 2);
return x.split('","');
}
return x.split(/[\t,]+/);
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CSV parsing here only handles rows where the entire line is wrapped in quotes and fields are separated by ",". Common CSV like 1,"a,b",c (quoted field only) will fall back to the regex split and still break on embedded commas/quotes. Consider using a real CSV parser (or a state-machine parser) that supports RFC4180-style quoting/escaped quotes and tabs.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CSV is a particular format of CSV, that is machine generated, so this is also not necessary

}
};

const xarray = xraw.split("\n").filter(Boolean);
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting lines with split("\n") leaves \r on CRLF files, which can break the quoted-line detection (x.at(-1) === '"') and pollute the last field/header. Prefer splitting on /\r?\n/ (or trimming \r) before further parsing.

Suggested change
const xarray = xraw.split("\n").filter(Boolean);
const xarray = xraw.split(/\r?\n/).filter(Boolean);

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be included in the next commit

Comment on lines 1505 to 1508
name: headers[i],
schemaVersion: schemaVersion,
definition: headers[i], // full header name
values: [] // value objects
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Headers and values from an imported file are treated as trusted strings and later rendered in the UI; parse_json uses .html(...) for some label rendering, so a malicious header/value could inject HTML/JS. Either sanitize/escape imported headers/values before building the schema, or change the downstream rendering to use .text(...) for user-controlled content.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will update parse_json to change html to text

Comment on lines 1522 to 1546
row.forEach((value, i) => {
const dpkey = namespace + ":" + headerKeys[i] + ":" + version;
const seen = valuemap[i];

// incremental prefix key for value
let key = value[0];
let j = 2;
while (seen[key] && seen[key] !== value) {
key = value.substr(0, j);
j++;
}

// add new value to decision_point if needed
if (!(key in seen)) {
imported.data.decision_points[dpkey].values.push({
key, // incremental prefix
name: value, // display name
definition: value // textual description for SSVC schema
});
seen[key] = value;
}

// add mapping for this row
rowMapping[dpkey] = key;
});
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let key = value[0]; will produce undefined keys for empty strings, and will throw if value is undefined (which can happen because the regex split collapses consecutive delimiters and rows may have fewer columns than headers). Normalize values (trim, default empty to a safe token) and ensure each row is padded/validated against headers.length before generating keys/mappings.

Suggested change
row.forEach((value, i) => {
const dpkey = namespace + ":" + headerKeys[i] + ":" + version;
const seen = valuemap[i];
// incremental prefix key for value
let key = value[0];
let j = 2;
while (seen[key] && seen[key] !== value) {
key = value.substr(0, j);
j++;
}
// add new value to decision_point if needed
if (!(key in seen)) {
imported.data.decision_points[dpkey].values.push({
key, // incremental prefix
name: value, // display name
definition: value // textual description for SSVC schema
});
seen[key] = value;
}
// add mapping for this row
rowMapping[dpkey] = key;
});
// Iterate over expected columns only, normalizing missing/empty values
for (let i = 0; i < headerKeys.length; i++) {
const dpkey = namespace + ":" + headerKeys[i] + ":" + version;
const seen = valuemap[i];
// Normalize cell value: coerce to string, trim, and default empties
const rawValue = row[i] != null ? String(row[i]) : "";
let value = rawValue.trim();
if (value === "") {
// Safe token for empty/missing cells
value = "[empty]";
}
// Generate a short, collision-resistant key for this column's value
const keySource = value;
const baseKey = keySource.charAt(0) || "_";
let key = baseKey;
let suffix = 1;
while (seen[key] && seen[key] !== keySource) {
key = baseKey + String(suffix);
suffix++;
}
// add new value to decision_point if needed
if (!(key in seen)) {
imported.data.decision_points[dpkey].values.push({
key, // incremental key
name: value, // display name
definition: value // textual description for SSVC schema
});
seen[key] = keySource;
}
// add mapping for this row
rowMapping[dpkey] = key;
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not useful suggestion as "_" is not allowed in some of the fields as beginning.

});
topalert("Decision tree has been updated with "+
imported.data.mapping.length+" possible decisions using "+detect_version+" CSV/TSV file, You can use it now!","success");
console.log(imported);
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console.log(imported) will log the entire imported decision tree (including potentially sensitive local file contents) in production. Consider removing this or gating it behind an explicit debug flag.

Suggested change
console.log(imported);

Copilot uses AI. Check for mistakes.
const headers = xr[0];
const rows = xr.slice(1);

let detect_version = "CSVv1"
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid automated semicolon insertion (95% of all statements in the enclosing function have an explicit semicolon).

Suggested change
let detect_version = "CSVv1"
let detect_version = "CSVv1";

Copilot uses AI. Check for mistakes.
@sei-vsarvepalli sei-vsarvepalli marked this pull request as draft January 23, 2026 16:03
Signed-off-by: Vijay Sarvepalli <vssarvepalli@cert.org>
@sei-vsarvepalli sei-vsarvepalli marked this pull request as ready for review January 23, 2026 16:46
Copy link
Contributor

@sei-renae sei-renae left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calc works now, but I didn't try giving it malicious json or csv.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants