Releases: agent-fox-dev/agent-fox

v3.3.1

23 Apr 12:37

What's New

Features

  • Test coverage regression gate — measures per-file line coverage before and after coder sessions; blocks the task if coverage decreases on modified files (#520)
    • Multi-language coverage tool detection: pytest-cov (Python), cargo-tarpaulin (Rust), go test -cover (Go)
    • Coverage data stored in session outcomes for trend tracking (migration v20)
    • Blocking findings emitted via review_findings table on regression
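
The gate described above can be pictured as a comparison of two per-file coverage snapshots, restricted to the files a session modified. The following is a minimal sketch under stated assumptions, not agent-fox's actual code: the names `CoverageRegression` and `find_regressions`, the percentage-based dictionaries, and the `tolerance` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageRegression:
    """One modified file whose line coverage dropped (hypothetical type)."""
    path: str
    before: float
    after: float


def find_regressions(before, after, modified_files, tolerance=0.0):
    """Return a regression record for each modified file whose coverage fell.

    `before` and `after` map file paths to line-coverage percentages.
    Files missing from either snapshot are skipped, since there is
    nothing to compare them against.
    """
    regressions = []
    for path in sorted(modified_files):
        if path not in before or path not in after:
            continue
        if after[path] < before[path] - tolerance:
            regressions.append(CoverageRegression(path, before[path], after[path]))
    return regressions


# Usage: c.py also dropped, but it was not modified, so it does not block.
before = {"a.py": 90.0, "b.py": 80.0, "c.py": 50.0}
after = {"a.py": 85.0, "b.py": 80.0, "c.py": 40.0}
blocking = find_regressions(before, after, modified_files={"a.py", "b.py"})
task_blocked = bool(blocking)
```

Limiting the check to modified files keeps unrelated coverage noise elsewhere in the repository from blocking a task.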

Full Changelog: v3.3.0...v3.3.1

v3.3.0

23 Apr 11:55

What's New

Features

  • Verification checklist & task completion enforcement — structured verification checklist for spec compliance (#521)
  • State transition validation in GraphSync — validates engine state transitions to catch illegal graph moves (#523)
  • Eager pre-review with retry-predecessor — restores eager pre-review behavior with retry on predecessor failure (#519)
  • Lightweight errata generation from blocking — reinstates errata generation when issues are blocked (#522)
  • Knowledge system pruning — migration v18 removes causal links and dead knowledge modules (spec 116)
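
State transition validation of the kind mentioned above is commonly built on an explicit table of allowed moves. The sketch below illustrates that general technique only; the state names, the `ALLOWED` table, and `validate_transition` are assumptions for illustration and are not taken from GraphSync.

```python
from enum import Enum


class NodeState(Enum):
    # Hypothetical task states; the real engine's states may differ.
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"
    BLOCKED = "blocked"


# For each state, the set of states it may legally move to.
ALLOWED = {
    NodeState.PENDING: {NodeState.IN_PROGRESS, NodeState.BLOCKED},
    NodeState.IN_PROGRESS: {NodeState.COMPLETED, NodeState.FAILED, NodeState.BLOCKED},
    NodeState.FAILED: {NodeState.PENDING, NodeState.BLOCKED},
    NodeState.BLOCKED: {NodeState.PENDING},
    NodeState.COMPLETED: set(),  # terminal in this sketch
}


class IllegalTransition(ValueError):
    """Raised when a requested move is not in the allowed table."""


def validate_transition(current, target):
    """Raise IllegalTransition unless current -> target is an allowed move."""
    if target not in ALLOWED[current]:
        raise IllegalTransition(f"{current.value} -> {target.value} is not allowed")
```

Calling `validate_transition` before every write to the graph turns a silent bad state into an immediate, descriptive error.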

Bug Fixes

  • Fix max_items in property test to avoid retrieval cap masking failures
  • Use Path-typed specs_path variable in plan_cmd (#516)
  • Fix ruff format violation in RuntimeError f-string (#515)
  • Add proper type annotations for embedder and backend variables (#514)

Refactoring

  • Extract strategy classes from engine, fix_pipeline, and result_handler (#518)
  • Inline single-consumer modules and deduplicate review parser
  • Remove dead code and consolidate single-consumer modules (2 passes)
  • Remove dead code and consolidate near-identical abstractions
  • Delete dead knowledge modules (blocking_history, errata_store, gotcha_extraction, gotcha_store) and simplify provider

Full Changelog: v3.2.0...v3.3.0

v3.2.0

22 Apr 14:56

What's Changed

Features

  • knowledge: Decouple knowledge subsystem via KnowledgeProvider protocol (spec 114)
  • knowledge: Pluggable knowledge provider with gotcha extraction, errata store, and content hashing (spec 115)
  • engine: Wire FoxKnowledgeProvider into engine startup

Refactors

  • knowledge: Delete 40+ legacy knowledge pipeline modules (lang analyzers, retrieval, consolidation, embeddings, etc.)
  • config: Remove obsolete knowledge pipeline configuration options
  • cli: Remove onboard command and legacy nightshift streams

Chores

  • Supersede specs 112 (sleep time compute) and 113 (knowledge effectiveness)
  • Fix Unicode edge case in content hash determinism property test
  • Clean up leftover __pycache__ directories in deleted knowledge subdirectories

v3.1.4

22 Apr 08:34

What's Changed

Bug Fixes

  • engine: Close AsyncAnthropic clients to prevent event loop shutdown crash (fixes #506)
  • engine: Skip redundant cleanup ingestion when barrier already ran (fixes #505)
  • knowledge: Always write agent trace JSONL for transcript reconstruction (fixes #507)
  • Guard trace reconstruction behind debug flag to suppress spurious warning

Features

  • engine: Add pre-flight check to skip coder sessions when work is done (fixes #511)

Other

  • New specs 114 (knowledge decoupling), 115 (pluggable knowledge)
  • Coding-session architecture documentation
  • General cleanup

Full Changelog: v3.1.3...v3.1.4

v3.1.3

21 Apr 15:54

What's Changed

Bug Fixes

  • Budget exhaustion detection: Sessions that hit the SDK max-budget-usd limit are now detected and blocked immediately instead of being wastefully retried. The SDK returns is_error=True with no message on budget exhaustion — previously mapped to "Unknown error" and retried through the escalation ladder.
  • AssessmentManager config: Pass full_config (not OrchestratorConfig) to AssessmentManager, fixing missing attribute errors.
  • Escalation ladder starting tier: The escalation ladder now respects config.models.coding for the starting tier instead of always defaulting to STANDARD.
  • Timed-out session metrics: Emit descriptive error messages and metrics for sessions that time out.
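
The budget exhaustion fix above hinges on one signal: an error result that carries no message. A minimal sketch of that classification follows. `SessionResult`, `FailureKind`, `classify`, and `should_retry` are hypothetical names for illustration; only the `is_error`-with-no-message condition comes from the release note.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    NONE = "none"
    BUDGET_EXHAUSTED = "budget_exhausted"
    RETRYABLE = "retryable"


@dataclass
class SessionResult:
    """Hypothetical stand-in for the result a session returns."""
    is_error: bool
    message: Optional[str] = None


def classify(result):
    """Map a session result to a failure kind."""
    if not result.is_error:
        return FailureKind.NONE
    # Assumption from the note: error with an empty or missing message
    # means the budget limit was hit, so retrying cannot help.
    if not result.message:
        return FailureKind.BUDGET_EXHAUSTED
    return FailureKind.RETRYABLE


def should_retry(result):
    """Only ordinary errors go back through the escalation ladder."""
    return classify(result) is FailureKind.RETRYABLE
```

Separating classification from the retry decision keeps the "Unknown error" fallback from swallowing a condition that needs different handling.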

Features

  • Knowledge system effectiveness (spec 113): Transcript reconstruction, compaction improvements, entity signal activation, cold-start handling, git extraction, audit consumption, retrieval quality validation, and audit prompt injection.

Other

  • Parking service audit report
  • Session budget increased for lengthy tasks

v3.1.2

21 Apr 03:31

Bug Fixes

  • engine: Move review concurrency cap before _prepare_launch to prevent phantom retry exhaustion (fixes #503)

    The review concurrency cap in _fill_parallel_pool was checked after _prepare_launch(), which increments the attempt tracker on "allowed" verdicts. When the single review slot was occupied, audit-review tasks were skipped but their attempt counter was already incremented. After max_retries + 1 (default 3) such pool-refill cycles, the circuit breaker permanently blocked the task with "Retry limit exceeded" — without ever starting a session. This cascade-blocked all downstream coding and verifier tasks, exceeding the block budget and halting the entire run.
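
The ordering problem described above can be reproduced in a few lines. This is a simplified sketch, not the engine's code: the `Pool` class, `try_launch_review`, and the default values are hypothetical, and `prepare_launch` only mimics the attempt-counting side effect attributed to `_prepare_launch()`.

```python
from collections import defaultdict


class Pool:
    """Toy scheduler with a capped number of concurrent review sessions."""

    def __init__(self, review_slots=1, max_retries=2):
        self.review_slots = review_slots
        self.max_retries = max_retries
        self.active_reviews = 0
        self.attempts = defaultdict(int)

    def prepare_launch(self, task_id):
        """Count an attempt, then apply the circuit breaker."""
        self.attempts[task_id] += 1
        return self.attempts[task_id] <= self.max_retries + 1

    def try_launch_review(self, task_id):
        # Fixed ordering: check the concurrency cap BEFORE counting an
        # attempt, so a full slot never consumes the retry budget.
        if self.active_reviews >= self.review_slots:
            return "deferred"
        if not self.prepare_launch(task_id):
            return "blocked"
        self.active_reviews += 1
        return "launched"


# Usage: the single review slot is busy for five pool-refill cycles.
pool = Pool()
pool.active_reviews = 1
deferred = [pool.try_launch_review("audit-review") for _ in range(5)]
pool.active_reviews = 0  # the slot frees up
outcome = pool.try_launch_review("audit-review")
```

With the two checks in the opposite order, the five deferred cycles would each have incremented the counter and the final call would have been blocked without a session ever starting.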

Recovery for affected runs

If you have a stuck run with audit-review tasks blocked by "Retry limit exceeded", clear the stale state:

agent-fox reset --spec <affected_spec_name>

Full Changelog: v3.1.1...v3.1.2

v3.1.1

20 Apr 14:54

Bug Fixes

  • reset: clear session-scoped tables on reset to prevent block_limit death-loop (#501)

    After a block_limit run, reset --hard (and soft reset) left stale data in six session-scoped DB tables (runs, session_outcomes, review_findings, verification_results, drift_findings, blocking_history). The stale runs.status='block_limit' caused load_state_from_db() to load a terminal status, making the engine loop exit immediately on every subsequent agent-fox code invocation — a self-perpetuating death-loop with no CLI recovery path.

    All four reset paths (reset_all, reset_task, reset_spec, hard_reset_all/hard_reset_task) now clear session-scoped tables so that plan and code start from a clean state.
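
The shape of the fix above is a single helper that every reset path calls. The sketch below uses `sqlite3` purely as a stand-in database so the example is runnable; the six table names come from the note, while the schema, `make_demo_db`, and `clear_session_tables` are assumptions for illustration.

```python
import sqlite3

# The six session-scoped tables named in the release note.
SESSION_SCOPED_TABLES = (
    "runs",
    "session_outcomes",
    "review_findings",
    "verification_results",
    "drift_findings",
    "blocking_history",
)


def make_demo_db():
    """Build an in-memory database with a stale terminal run (demo schema)."""
    conn = sqlite3.connect(":memory:")
    for table in SESSION_SCOPED_TABLES:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO runs (status) VALUES ('block_limit')")
    conn.commit()
    return conn


def clear_session_tables(conn):
    """Delete all rows from every session-scoped table in one transaction."""
    with conn:  # commits on success, rolls back on error
        for table in SESSION_SCOPED_TABLES:
            # Table names come from the constant above, never from user input.
            conn.execute(f"DELETE FROM {table}")


conn = make_demo_db()
stale_before = conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]
clear_session_tables(conn)
stale_after = conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]
```

Routing all reset paths through one helper means a newly added session-scoped table only has to be registered in one place.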

Full Changelog: v3.1.0...v3.1.1

v3.1.0

20 Apr 13:40

What's New

Sleep-Time Compute (Spec 112)

A new knowledge-processing pipeline that runs background computation during idle periods:

  • Core protocol & orchestrator — schema, configuration, and orchestration layer for sleep-time tasks
  • ContextRewriter — sleep task that rewrites and enriches knowledge context
  • BundleBuilder — sleep task that builds consolidated knowledge bundles
  • Retriever & integration wiring — retrieval layer with full integration into the existing knowledge system
  • Wiring verification — end-to-end verification of the sleep-time compute pipeline

Full Changelog

  • feat(112): implement core protocol, orchestrator, config, and schema
  • feat(112): implement ContextRewriter sleep task
  • feat(112): implement retriever and integration wiring
  • test(112): failing spec tests, checkpoint, and wiring verification

v3.0.5

20 Apr 11:29

What's Changed

Bug Fixes

  • nightshift: exclude .agent-fox/ from onboard file scanning (#499)
  • nightshift: add --specs-dir flag to plan and night-shift commands (#498)
  • nightshift: add progress spinner to onboard command (#497)

Other

  • Updated config

Full Changelog: v3.0.4...v3.0.5

v3.0.4

20 Apr 09:47

What's Changed

  • fix(nightshift): Triage agent now receives a triage-specific task prompt instead of the coder's "Fix the issue" prompt. This was the root cause of all triage parse failures — the agent would implement the fix instead of producing a JSON triage report.
  • fix(tests): Knowledge wiring tests no longer leak .specs/ directories into the working tree.