feat: implement ETF backtest CLI #8

valuecodes · 2026-01-27T07:48:15Z

What

This update introduces a command-line interface (CLI) for ETF backtesting, along with significant enhancements to the agent runner. The new CLI allows users to fetch and cache ETF data, while the agent runner now supports stateless execution and improved logging. The changes streamline the configuration and execution of agents, making it easier to manage and log results.

- Added EtfDataFetcher class for fetching and caching ETF data.
- Enhanced AgentRunner to manage agent execution with improved logging and event handling.
- Updated CLI to include new arguments for ETF backtest usage.
- Refactored logging structure for clarity and consistency across multiple files.
- Removed legacy scripts and unused constants to clean up the codebase.
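
For readers unfamiliar with the fetch-and-cache idea behind EtfDataFetcher, here is a minimal sketch, not the class from this PR. It assumes an in-memory cache keyed by ISIN and a `refresh` flag that bypasses it; `CachedFetcherSketch` and `Loader` are hypothetical names, and the real class may cache on disk instead.

```typescript
// Hypothetical loader type: whatever actually retrieves data for one ISIN.
type Loader<T> = (isin: string) => Promise<T>;

// Illustrative only: an in-memory, per-ISIN cache with a refresh override.
class CachedFetcherSketch<T> {
  private readonly cache = new Map<string, Promise<T>>();
  private readonly load: Loader<T>;

  constructor(load: Loader<T>) {
    this.load = load;
  }

  fetch(isin: string, refresh = false): Promise<T> {
    const cached = this.cache.get(isin);
    // Reuse the cached entry unless the caller explicitly asks for fresh data.
    if (cached !== undefined && !refresh) return cached;

    const pending = this.load(isin);
    this.cache.set(isin, pending);
    // Evict failed loads so a later call can retry instead of reusing the error.
    pending.catch(() => {
      if (this.cache.get(isin) === pending) this.cache.delete(isin);
    });
    return pending;
  }
}
```

Caching the promise rather than the resolved value means two concurrent requests for the same ISIN share one load.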

How to test

- Run `pnpm run:etf-backtest` with appropriate arguments to test the new CLI functionality.
- Execute `pnpm test` to ensure all tests pass.
- Use `pnpm lint` to check for any linting issues.
- Validate the logging output during agent execution to confirm improvements.

Security review

- Secrets / env vars: not changed.
- Auth / session: not changed.
- Network / API calls: changed. (New API calls for fetching ETF data.)
- Data handling / PII: changed. (Enhanced logging may include user-provided data.)
- Dependencies: not changed.

No security-impacting changes identified.

No new dependencies. The only new network calls are the ETF data requests noted above.
No env var changes and no auth/session logic touched.

Copilot

Pull request overview

This PR introduces a comprehensive ETF backtesting CLI with significant infrastructure improvements to support agent-based feature optimization. The changes migrate logging to a structured format, refactor tools to use a factory pattern with logger injection, and add a new AgentRunner abstraction for consistent agent execution across CLIs.

Changes:

- Added AgentRunner class for managing agent execution with improved logging and stateless session support for reasoning models
- Migrated all logging from template literals to structured logging with data objects (enforced via ESLint rule)
- Refactored tools (writeFile, readFile, listFiles, fetchUrl) to use factory pattern accepting logger instances
- Added new runPython tool for executing Python scripts with JSON stdin support
- Implemented ETF backtest CLI with data fetching, learnings persistence, and Python-based ML experiments
- Enhanced PlaywrightScraper with network capture capability for API interception
- Added comprehensive test coverage for all new features
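
As a rough illustration of the two patterns named above (tool factories that accept a logger, and structured log calls with a data object instead of a template literal), here is a minimal sketch. The `Logger` and `Tool` interfaces and `createEchoTool` are hypothetical and are not the types used in this repository.

```typescript
// Hypothetical logger shape: a fixed message plus a structured data object.
interface Logger {
  info(message: string, data?: Record<string, unknown>): void;
}

// Hypothetical tool shape, for illustration only.
interface Tool<I, O> {
  name: string;
  execute(input: I): Promise<O>;
}

// Factory pattern: the logger is injected instead of imported as a singleton,
// so each CLI (or test) can supply its own instance.
const createEchoTool = (logger: Logger): Tool<{ text: string }, string> => ({
  name: "echo",
  async execute(input) {
    // Structured call: constant message, variable parts in the data object.
    // The template-literal form would be: logger.info(`echo ${input.text.length}`)
    logger.info("Tool called", { tool: "echo", length: input.text.length });
    return input.text;
  },
});

// Usage with a logger that records entries, as a test might do.
const entries: Array<{ message: string; data?: Record<string, unknown> }> = [];
const recordingLogger: Logger = {
  info: (message, data) => {
    entries.push({ message, data });
  },
};
const echoTool = createEchoTool(recordingLogger);
void echoTool.execute({ text: "hello" });
```

Keeping the message constant and moving variable values into the data object makes log lines easier to filter and aggregate.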

Reviewed changes

Copilot reviewed 41 out of 43 changed files in this pull request and generated no comments.


| File | Description |
| --- | --- |
| `src/clients/agent-runner.ts` | New abstraction for agent execution with event handling, stateless mode, and structured logging |
| `src/clients/agent-runner.test.ts` | Comprehensive tests for AgentRunner including event deduplication and configuration |
| `src/clients/playwright-scraper.ts` | Added `scrapeWithNetworkCapture` method for intercepting API responses with localStorage support |
| `src/tools/*/` | Refactored all tools to factory pattern with logger injection for consistent structured logging |
| `src/cli/etf-backtest/` | Complete ETF backtest CLI implementation with data fetching, learnings management, and optimization |
| `src/cli/etf-backtest/scripts/` | Python scripts for ML-based feature engineering and backtesting |
| `src/utils/question-handler.ts` | Migrated to structured logging |
| `src/utils/parse-args.ts` | Migrated to structured logging |
| `eslint.config.ts` | Added rules to enforce structured logging (no template literals in logger calls) |
| `README.md`, `AGENTS.md`, `agent/PLANS.md` | Updated documentation for new features and conventions |

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7362b926e

chatgpt-codex-connector · 2026-01-27T07:52:54Z

src/cli/etf-backtest/main.ts

+        currentPrompt =
+          "You ran too many experiments in one turn. Please run exactly ONE experiment, then respond with your JSON analysis.";


Preserve dataPath/seed in recovery prompts

Because AgentRunner is configured with stateless: true, each iteration only receives the static instructions plus currentPrompt. The recovery prompt here omits the concrete dataPath and seed values that the tool call requires, so the next run has no way to supply real inputs (the instructions only mention placeholders like <seed>/<dataPath>). In practice this makes the error-recovery path fail: the agent will call runPython with placeholders or no path, and the script falls back to a non-existent default path (tmp/etf-backtest/data.json rather than the per-ISIN cache). The same omission happens in the invalid-JSON branch a few lines below, so any parse error or max-turns error leads to a stuck loop.
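
One way to address this is to build every recovery prompt from the concrete run inputs, so a stateless turn always sees them. The sketch below is an illustration under assumptions: `sketchRecoveryPrompt`, `RunContext`, and `RecoveryReason` are hypothetical names, the invalid-JSON wording is invented for the example, and the PR's own prompt builders may look different.

```typescript
// Hypothetical container for the values the runPython tool call needs.
interface RunContext {
  dataPath: string;
  seed: number;
}

// Hypothetical labels for the two recovery branches discussed above.
type RecoveryReason = "too-many-experiments" | "invalid-json";

// Illustrative only: repeat the concrete inputs in every recovery prompt,
// because a stateless run does not remember the previous turn.
const sketchRecoveryPrompt = (
  reason: RecoveryReason,
  ctx: RunContext,
): string => {
  const problem =
    reason === "too-many-experiments"
      ? "You ran too many experiments in one turn."
      : "Your last response was not valid JSON.";
  return [
    problem,
    "Please run exactly ONE experiment, then respond with your JSON analysis.",
    `Call runPython with dataPath="${ctx.dataPath}" and seed=${ctx.seed}.`,
  ].join(" ");
};

// Usage with placeholder values (not real paths from this repository).
const recoveryExample = sketchRecoveryPrompt("invalid-json", {
  dataPath: "/path/to/cached-data.json",
  seed: 42,
});
console.log(recoveryExample);
```

Routing both branches through one builder keeps the too-many-experiments and invalid-JSON paths from drifting apart.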

chatgpt-codex-connector · 2026-01-27T07:52:54Z

src/cli/etf-backtest/constants.ts

+// Match performance-chart requests with full historical data (dateFrom before 2020)
+export const getEtfApiPattern = (isin: string): RegExp =>
+  new RegExp(`/api/etfs/${isin}/performance-chart.*dateFrom=(19|200|201)`);


Match 202x dateFrom values in API capture regex

The capture regex hard-codes dateFrom to (19|200|201), which excludes 202x. For ETFs launched in the 2020s, justetf’s performance-chart request typically uses dateFrom=2021/2022 etc., so the pattern won’t match and scrapeWithNetworkCapture will time out. That makes EtfDataFetcher.fetch() fail for newer ETFs even though the data exists. Broadening the pattern (e.g., 20\d\d or removing the constraint) avoids this class of failures.
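
A broadened pattern could look like the sketch below. It is an assumption-laden illustration, not the merged fix: `getEtfApiPatternBroad` and `escapeRegExp` are hypothetical names, the sample URLs only mimic the query shape implied by the original regex, and the ISIN is just an example value. Accepting any `19xx` or `20xx` year may also match shorter-range chart requests, which the original constraint appears to have been filtering out.

```typescript
// Escape regex metacharacters so the ISIN is matched literally.
const escapeRegExp = (s: string): string =>
  s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Original pattern from the diff, kept here for comparison.
const getEtfApiPatternOriginal = (isin: string): RegExp =>
  new RegExp(`/api/etfs/${isin}/performance-chart.*dateFrom=(19|200|201)`);

// Hypothetical broadened variant: accept any 19xx or 20xx year.
const getEtfApiPatternBroad = (isin: string): RegExp =>
  new RegExp(
    `/api/etfs/${escapeRegExp(isin)}/performance-chart.*dateFrom=(19|20)\\d\\d`,
  );

// Sample URLs are assumptions made up for this example.
const isinExample = "IE00B4L5Y983";
const url2021 = `/api/etfs/${isinExample}/performance-chart?dateFrom=2021-03-01`;
const url2009 = `/api/etfs/${isinExample}/performance-chart?dateFrom=2009-09-25`;

console.log(getEtfApiPatternOriginal(isinExample).test(url2021)); // false
console.log(getEtfApiPatternBroad(isinExample).test(url2021)); // true
console.log(getEtfApiPatternBroad(isinExample).test(url2009)); // true
```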


valuecodes added 16 commits January 26, 2026 10:11

feat: add etf backtest cli

b2dc6d8

feat: implement agent runner with logging and event handling

6b57331

- Add AgentRunner class to manage agent execution and logging
- Create utility functions for result formatting and extraction
- Refactor existing code to utilize the new AgentRunner

refactor: simplify agent runner configuration and update model name

18d66db

- Remove AGENT_NAME and MODEL_NAME constants
- Update model to "gpt-5-mini" in AgentRunner
- Format PYTHON_BINARY path for better readability

refactor: remove unused constants and clean up code

04886e3

refactor: update agent runner to accept prompt in options object

a9b6ee7

feat: enhance agent runner with stateless execution and logging

6c459a6

- Implement stateless option for independent runs
- Update logging for agent execution results
- Refactor tests to align with new run method signature

refactor: streamline tool creation with logger integration

51485a9

- Replace existing tools with factory functions that accept logger
- Update tests to utilize new tool creation methods

refactor: enhance logging structure and clarity across multiple files

40bcff9

- Replace template literals in logger calls with structured objects
- Improve log messages for better readability and consistency
- Update logger calls in various tools and main files

feat: implement ETF data fetching with caching and logging

d307a4e

- Add EtfDataFetcher class for fetching and caching ETF data
- Update main logic to integrate ETF data fetching
- Enhance CLI argument parsing to include ISIN and refresh options

chore: remove legacy backtest and prediction scripts

cbcf4e2

refactor: remove ticker references from CLI and related scripts

6d48ede

- Eliminate DEFAULT_TICKER constant and related parsing
- Update run_experiment function to remove ticker parameter

docs: update README and agent documentation for ETF backtest usage

d19ff50

- Clarify CLI arguments and usage for ETF backtest
- Update data fetching and caching details in documentation
- Modify logging method in final report utility

feat: implement learnings manager for ETF backtest optimization

9eb364e

- Add LearningsManager class for managing learnings persistence
- Introduce learnings schema and formatter for prompt-friendly summaries
- Update main logic to utilize learnings during optimization runs

refactor: remove REASONING_PREVIEW_LIMIT constant and update usage

e3ae01b

docs: update AGENTS and ETF backtest README for clarity and new features

1d9adec

- Add usage of AgentRunner as default wrapper for agents
- Include flowchart for ETF backtest process
- Add demo image for ETF backtest

refactor: move ETF data schemas to schemas.ts and remove old types

a7362b9

- Update imports in etf-data-fetcher.ts to reflect new schema location
- Delete outdated etf-data.ts file

Copilot AI review requested due to automatic review settings January 27, 2026 07:48

Copilot started reviewing on behalf of valuecodes January 27, 2026 07:48

Copilot AI reviewed Jan 27, 2026

test: update tool logging in AgentRunner tests for clarity

d5b92a4

chatgpt-codex-connector bot reviewed Jan 27, 2026

valuecodes added 2 commits January 27, 2026 09:57

feat: add prompt builders for runPython usage and recovery messages

36165ee

- Implement buildRunPythonUsage and buildRecoveryPrompt functions
- Update main agent logic to utilize new prompt builders
- Add tests for prompt builder functions

refactor: remove redundant Python script tests from run-python-tool

3656fd2

valuecodes merged commit dc3d596 into main Jan 27, 2026
4 checks passed

valuecodes deleted the feat/etf-backtest branch January 27, 2026 08:08
