-
Notifications
You must be signed in to change notification settings - Fork 0
Feature/chamap #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
4b6a621
b9fabff
b8e276b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -54,6 +54,7 @@ def generate_select_query(config_data, source_table, db_type='MySQL'): | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Generate a SELECT query based on configuration. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Applies TRIM at source for MSSQL CHAR columns to prevent padding. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| For SQL Server: Also cleans non-breaking spaces and control characters at source. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| For SQL Server: Also cleans non-breaking spaces and control characters at source. | |
| For SQL Server: Also replaces non-breaking spaces (CHAR(160)) with regular spaces and removes null bytes (CHAR(0)) at source. |
Copilot
AI
Feb 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The column identifier uses double quotes directly from the source column name without sanitization. If the source column name contains malicious content (e.g., special characters like double quotes), this could lead to SQL injection. Consider using SQLAlchemy's identifier quoting mechanism or validating/sanitizing column names before using them in queries.
Copilot
AI
Feb 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The REPLACE operations are unconditionally applied to all columns for SQL Server, regardless of the column's data type. This could cause issues with numeric, date, or binary columns where REPLACE operations on character data are not appropriate. Consider checking the column data type or restricting this transformation to text/varchar column types only.
| # Apply TRIM if specified in transformers | |
| if 'TRIM' in mapping.get('transformers', []): | |
| col_expr = f'TRIM({col_expr})' | |
| # Clean non-breaking spaces and problematic characters for VARCHAR/NVARCHAR/TEXT columns | |
| # REPLACE(col, CHAR(160), ' ') -> replace nbsp with regular space | |
| # REPLACE(col, CHAR(0), '') -> remove null bytes | |
| col_expr = f'REPLACE(REPLACE({col_expr}, CHAR(160), \' \'), CHAR(0), \'\')' | |
| # Determine if the source column is a text-like type where string REPLACE is appropriate. | |
| # This relies on an optional 'source_type' field in the mapping (e.g. 'VARCHAR(50)', 'NVARCHAR', 'TEXT'). | |
| source_type_raw = str(mapping.get('source_type', '') or '') | |
| source_type_upper = source_type_raw.upper() | |
| text_type_prefixes = ('CHAR', 'NCHAR', 'VARCHAR', 'NVARCHAR') | |
| is_text_column = ( | |
| any(source_type_upper.startswith(prefix) for prefix in text_type_prefixes) | |
| or source_type_upper in ('TEXT', 'NTEXT') | |
| ) | |
| # Apply TRIM if specified in transformers | |
| if 'TRIM' in mapping.get('transformers', []): | |
| col_expr = f'TRIM({col_expr})' | |
| # Clean non-breaking spaces and problematic characters only for text-like columns. | |
| # REPLACE(col, CHAR(160), ' ') -> replace nbsp with regular space | |
| # REPLACE(col, CHAR(0), '') -> remove null bytes | |
| if is_text_column: | |
| col_expr = f'REPLACE(REPLACE({col_expr}, CHAR(160), \' \'), CHAR(0), \'\')' |
Copilot
AI
Feb 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQL injection vulnerability in the AS clause. The source column name is used directly in quotes without sanitization. If a malicious column name contains double quotes or other SQL special characters, this could be exploited. Consider using SQLAlchemy's identifier quoting mechanism or validating/sanitizing column names.
Copilot
AI
Feb 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar SQL injection vulnerability as line 72. The column identifier uses double quotes directly from the source column name in both the TRIM function and the AS clause. Consider using SQLAlchemy's identifier quoting mechanism or validating/sanitizing column names.
Copilot
AI
Feb 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to line 72, this column identifier uses double quotes directly from the source column name without sanitization, which could lead to SQL injection if the source column name contains malicious content. Consider using SQLAlchemy's identifier quoting mechanism or validating/sanitizing column names.
Copilot
AI
Feb 4, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When no source profile is selected (src_db_type is None), the charset selection defaults to "utf8 (Default)" and "latin1 (Raw Bytes)" options. However, the charset_map on line 352 maps "utf8 (Default)" to None, which might not be appropriate for all database types. When a user later selects a source profile (e.g., MySQL), they won't see MySQL-specific options unless they change the selection again. Consider disabling charset selection until a source profile is selected or updating the options dynamically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition checks if 'charset' is in the list, but 'charset' could be None when default charsets are used (lines 59). This means the TDS version will never be set when using default charset, even if it might be beneficial for older SQL Server versions. Consider checking the actual value of 'mssql_charset' instead of 'charset', or explicitly handle the None case.