Source Data Deep Dive
Syed Ibrahim Omer edited this page Apr 13, 2026
This page documents how source_data() in src/indicators.py turns Yahoo Finance history into per-ticker Polars lazy frames.
`source_data(tickers, period, timeframe)`
- `tickers`: list of symbols (after `run_main` parsing).
- `period`: a single period string for this call (e.g. `1y`, `5y`).
- `timeframe`: either a Yahoo interval string (`1d`, `1wk`, …) or a dict mapping period → interval when using a timeframe JSON file.
- Uses `yf.Tickers(tickers).history(period=period, interval=...)`.
- If `timeframe` is a dict, the interval used is `timeframe[period]` (the same period as the outer loop in `run_main`).
- The result is passed through `reset_index()` and then converted with `pl.from_pandas(..., schema_overrides=schema).lazy()`.
- On conversion failure, the function prints the error and returns `[]` (an empty list), not an empty lazy frame.
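
The interval lookup described above can be sketched in isolation. This is a minimal illustration, not the code in `src/indicators.py`: the helper name `resolve_interval` and the `Timeframe` alias are hypothetical, and how the real function handles a period that is missing from the dict is not stated on this page.

```python
from typing import Mapping, Union

# Hypothetical alias: a plain interval string, or a period -> interval mapping
# such as one loaded from a timeframe JSON file.
Timeframe = Union[str, Mapping[str, str]]


def resolve_interval(period: str, timeframe: Timeframe) -> str:
    """Hypothetical helper: choose the Yahoo interval for one period."""
    if isinstance(timeframe, str):
        # A plain string applies to every period.
        return timeframe
    # A mapping is indexed by the period of the current call.
    # In this sketch a missing period raises KeyError.
    return timeframe[period]


print(resolve_interval("1y", "1d"))
print(resolve_interval("5y", {"1y": "1d", "5y": "1wk"}))
```

The returned string is what would be passed as `interval=` to `history()` for that period.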
yfinance returns a MultiIndex column layout for multiple tickers. The code:
- Collects schema names from the lazy frame.
- For each requested ticker, checks for a column named like `('Close', 'AAPL')` (string form in schema: `('Close', 'AAPL')`).
- If that column is missing, the ticker is skipped with a log line.
- For valid tickers, it selects only the columns that exist and renames them to simple names: `Close`, `High`, `Low`, `Open`, `Volume`, `Dividends`, `Stock Splits`, plus `Date`.
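
The validation and renaming steps above can be sketched on plain column-name strings. This is an illustration under assumptions, not the actual implementation: `column_key` and `rename_map` are hypothetical names, and the sketch assumes the date column appears in the schema as the stringified tuple `('Date', '')` after `reset_index()`.

```python
from typing import Dict, Iterable, Optional

# Simple names listed on this page, in the order given there.
FIELDS = ["Close", "High", "Low", "Open", "Volume", "Dividends", "Stock Splits"]


def column_key(field: str, ticker: str) -> str:
    """Stringified tuple name, e.g. "('Close', 'AAPL')"."""
    return str((field, ticker))


def rename_map(schema_names: Iterable[str], ticker: str) -> Optional[Dict[str, str]]:
    """Hypothetical helper: map existing columns to simple names.

    Returns None when the ticker has no Close column, which is the
    condition under which the ticker would be skipped.
    """
    names = set(schema_names)
    if column_key("Close", ticker) not in names:
        return None
    mapping: Dict[str, str] = {}
    date_key = column_key("Date", "")  # assumed form of the date column
    if date_key in names:
        mapping[date_key] = "Date"
    for field in FIELDS:
        key = column_key(field, ticker)
        if key in names:  # keep only columns that actually exist
            mapping[key] = field
    return mapping


schema = ["('Date', '')", "('Close', 'AAPL')", "('Volume', 'AAPL')", "('Close', 'MSFT')"]
print(rename_map(schema, "AAPL"))
print(rename_map(schema, "ZZZZ"))
```

A mapping of this shape could then drive a select-and-rename on the lazy frame; collecting the schema names once and reusing them for every ticker avoids repeated schema resolution.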
Returns a list of dict “packages”, one per valid ticker:
| Field | Meaning |
|---|---|
| `data` | LazyFrame with columns `Date`, `Close`, … (before lowercasing in `calculate_indicators`) |
| `ticker` | symbol |
| `period` | the period string used for this fetch |
- `collect_schema()` may be called multiple times per ticker during validation and selection (cost scales with schema complexity).
- One combined `history()` call fetches all tickers for that `(period, interval)` pair.
| Symptom | Likely cause |
|---|---|
| Empty return list | Conversion exception, or no tickers had Close data |
| Some tickers missing | Delisted, bad symbol, or no data for that period/interval |
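
When diagnosing the symptoms in the table, it can help to compare the requested symbols against the packages that came back. The following is a small sketch; `missing_tickers` is a hypothetical helper, and the packages are stand-ins that carry only the fields documented on this page.

```python
from typing import Any, Dict, List


def missing_tickers(requested: List[str], packages: List[Dict[str, Any]]) -> List[str]:
    """Hypothetical helper: requested symbols that have no returned package."""
    returned = {pkg["ticker"] for pkg in packages}
    return [t for t in requested if t not in returned]


# Stand-in packages; in real use, "data" would hold a Polars LazyFrame.
packages = [
    {"data": None, "ticker": "AAPL", "period": "1y"},
    {"data": None, "ticker": "MSFT", "period": "1y"},
]

print(missing_tickers(["AAPL", "MSFT", "ZZZZ"], packages))
print(missing_tickers(["AAPL"], []))
```

An empty package list marks every requested ticker as missing, which matches the first row of the table: either conversion failed or no ticker had `Close` data.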
Related pages:
- Getting Started
- CLI Reference
- Configuration & Templates
- Indicators (Overview)
- Output Formats
- Advanced Usage
- Troubleshooting
- Pipeline
- CLI Parsing
- Data Source (Yahoo Finance)
- Schema Normalization
- Data Shape Invariants
- Output Writing
- Write Output Deep Dive
- Config Resolution
- Polars Engine
- Source Modules
- Testing
- Performance
- Indicators Engine
- Reproducibility