AlphaFin/indicators/db_utils.py:53 builds a SQL string by direct f-string interpolation of fields, table_name, and start_date, all of which originate from caller code that is, in turn, driven by indicator configurations and (in some code paths) values traceable back to user input via the agent tool surface.
Steps to Reproduce
- Call
get_data_by_sql(file_path, db_name, table_name, codes, fields, start_date) with a start_date value such as "20200101' OR 1=1 --".
- The constructed SQL becomes:
select ... from ... where ts_code in (...) and trade_date >= '20200101' OR 1=1 --'
- SQLite happily parses and executes the broken predicate, returning unfiltered rows.
Code
sql = f"select {fields} from {table_name} where ts_code in {tuple(codes)}"
if start_date:
date_col = 'end_date' if table_name in _FINANCIAL_TABLES else 'trade_date'
sql += f" and {date_col} >= '{start_date}'"
Expected vs Actual
- Expected: Parameter values are passed via
pd.read_sql(sql, engine, params=...) placeholders so the SQL planner can never confuse data with code.
- Actual: All three of
fields, table_name, and start_date are spliced directly into the query.
Suggested Fix
Switch to parameterized queries for start_date, and validate table_name / fields against an allow-list before interpolation. Even if today's callers all pass safe constants, the function is exported and may be called from agent-driven contexts in the future.
Environment
- File:
AlphaFin/indicators/db_utils.py:41-62
- Severity: High
AlphaFin/indicators/db_utils.py:53builds a SQL string by direct f-string interpolation offields,table_name, andstart_date, all of which originate from caller code that is, in turn, driven by indicator configurations and (in some code paths) values traceable back to user input via the agent tool surface.Steps to Reproduce
get_data_by_sql(file_path, db_name, table_name, codes, fields, start_date)with astart_datevalue such as"20200101' OR 1=1 --".Code
Expected vs Actual
pd.read_sql(sql, engine, params=...)placeholders so the SQL planner can never confuse data with code.fields,table_name, andstart_dateare spliced directly into the query.Suggested Fix
Switch to parameterized queries for
start_date, and validatetable_name/fieldsagainst an allow-list before interpolation. Even if today's callers all pass safe constants, the function is exported and may be called from agent-driven contexts in the future.Environment
AlphaFin/indicators/db_utils.py:41-62