Skip to content

Conversation

@valuecodes
Copy link
Owner

What

This update introduces a command-line interface (CLI) for ETF backtesting, along with significant enhancements to the agent runner. The new CLI allows users to fetch and cache ETF data, while the agent runner now supports stateless execution and improved logging. The changes streamline the configuration and execution of agents, making it easier to manage and log results.

  • Added EtfDataFetcher class for fetching and caching ETF data.

  • Enhanced AgentRunner to manage agent execution with improved logging and event handling.

  • Updated CLI to include new arguments for ETF backtest usage.

  • Refactored logging structure for clarity and consistency across multiple files.

  • Removed legacy scripts and unused constants to clean up the codebase.

How to test

  • Run pnpm run:etf-backtest with appropriate arguments to test the new CLI functionality.

  • Execute pnpm test to ensure all tests pass.

  • Use pnpm lint to check for any linting issues.

  • Validate the logging output during agent execution to confirm improvements.

Security review

  • Secrets / env vars: not changed.

  • Auth / session: not changed.

  • Network / API calls: changed. (New API calls for fetching ETF data.)

  • Data handling / PII: changed. (Enhanced logging may include user-provided data.)

  • Dependencies: not changed.

No security-impacting changes identified.

  • No new dependencies and no network calls beyond existing functionality.

  • No env var changes and no auth/session logic touched.

- Add AgentRunner class to manage agent execution and logging
- Create utility functions for result formatting and extraction
- Refactor existing code to utilize the new AgentRunner
- Remove AGENT_NAME and MODEL_NAME constants
- Update model to "gpt-5-mini" in AgentRunner
- Format PYTHON_BINARY path for better readability
- Implement stateless option for independent runs
- Update logging for agent execution results
- Refactor tests to align with new run method signature
- Replace existing tools with factory functions that accept logger
- Update tests to utilize new tool creation methods
- Replace template literals in logger calls with structured objects
- Improve log messages for better readability and consistency
- Update logger calls in various tools and main files
- Add EtfDataFetcher class for fetching and caching ETF data
- Update main logic to integrate ETF data fetching
- Enhance CLI argument parsing to include ISIN and refresh options
- Eliminate DEFAULT_TICKER constant and related parsing
- Update run_experiment function to remove ticker parameter
- Clarify CLI arguments and usage for ETF backtest
- Update data fetching and caching details in documentation
- Modify logging method in final report utility
- Add LearningsManager class for managing learnings persistence
- Introduce learnings schema and formatter for prompt-friendly summaries
- Update main logic to utilize learnings during optimization runs
- Add usage of AgentRunner as default wrapper for agents
- Include flowchart for ETF backtest process
- Add demo image for ETF backtest
- Update imports in etf-data-fetcher.ts to reflect new schema location
- Delete outdated etf-data.ts file
Copilot AI review requested due to automatic review settings January 27, 2026 07:48
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a comprehensive ETF backtesting CLI with significant infrastructure improvements to support agent-based feature optimization. The changes migrate logging to a structured format, refactor tools to use a factory pattern with logger injection, and add a new AgentRunner abstraction for consistent agent execution across CLIs.

Changes:

  • Added AgentRunner class for managing agent execution with improved logging and stateless session support for reasoning models
  • Migrated all logging from template literals to structured logging with data objects (enforced via ESLint rule)
  • Refactored tools (writeFile, readFile, listFiles, fetchUrl) to use factory pattern accepting logger instances
  • Added new runPython tool for executing Python scripts with JSON stdin support
  • Implemented ETF backtest CLI with data fetching, learnings persistence, and Python-based ML experiments
  • Enhanced PlaywrightScraper with network capture capability for API interception
  • Added comprehensive test coverage for all new features

Reviewed changes

Copilot reviewed 41 out of 43 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/clients/agent-runner.ts New abstraction for agent execution with event handling, stateless mode, and structured logging
src/clients/agent-runner.test.ts Comprehensive tests for AgentRunner including event deduplication and configuration
src/clients/playwright-scraper.ts Added scrapeWithNetworkCapture method for intercepting API responses with localStorage support
src/tools/*/ Refactored all tools to factory pattern with logger injection for consistent structured logging
src/cli/etf-backtest/ Complete ETF backtest CLI implementation with data fetching, learnings management, and optimization
src/cli/etf-backtest/scripts/ Python scripts for ML-based feature engineering and backtesting
src/utils/question-handler.ts Migrated to structured logging
src/utils/parse-args.ts Migrated to structured logging
eslint.config.ts Added rules to enforce structured logging (no template literals in logger calls)
README.md, AGENTS.md, agent/PLANS.md Updated documentation for new features and conventions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7362b926e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 200 to 201
currentPrompt =
"You ran too many experiments in one turn. Please run exactly ONE experiment, then respond with your JSON analysis.";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve dataPath/seed in recovery prompts

Because AgentRunner is configured with stateless: true, each iteration only receives the static instructions plus currentPrompt. The recovery prompt here omits the concrete dataPath and seed values that the tool call requires, so the next run has no way to supply real inputs (the instructions only mention placeholders like <seed>/<dataPath>). In practice this makes the error-recovery path fail: the agent will call runPython with placeholders or no path, and the script falls back to a non-existent default path (tmp/etf-backtest/data.json rather than the per-ISIN cache). The same omission happens in the invalid-JSON branch a few lines below, so any parse error or max-turns error leads to a stuck loop.

Useful? React with 👍 / 👎.

Comment on lines +14 to +16
// Match performance-chart requests with full historical data (dateFrom before 2020)
export const getEtfApiPattern = (isin: string): RegExp =>
new RegExp(`/api/etfs/${isin}/performance-chart.*dateFrom=(19|200|201)`);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Match 202x dateFrom values in API capture regex

The capture regex hard-codes dateFrom to (19|200|201), which excludes 202x. For ETFs launched in the 2020s, justetf’s performance-chart request typically uses dateFrom=2021/2022 etc., so the pattern won’t match and scrapeWithNetworkCapture will time out. That makes EtfDataFetcher.fetch() fail for newer ETFs even though the data exists. Broadening the pattern (e.g., 20\d\d or removing the constraint) avoids this class of failures.

Useful? React with 👍 / 👎.

- Implement buildRunPythonUsage and buildRecoveryPrompt functions
- Update main agent logic to utilize new prompt builders
- Add tests for prompt builder functions
@valuecodes valuecodes merged commit dc3d596 into main Jan 27, 2026
4 checks passed
@valuecodes valuecodes deleted the feat/etf-backtest branch January 27, 2026 08:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants