Merged
8 changes: 8 additions & 0 deletions .gitignore
@@ -11,6 +11,14 @@ __pycache__/
data/
tmpclaude-*

# Benchmark data policy: commit manifests/schemas/thresholds, keep raw/private data local.
benchmarks/raw/
benchmarks/private/
benchmarks/**/*.pdf
benchmarks/**/*.xlsx
benchmarks/**/*.xls
benchmarks/**/labels.private.json

# Example runtime artifacts
examples/**/output/
examples/**/fixtures/*.pdf
36 changes: 23 additions & 13 deletions README.md
@@ -1,12 +1,14 @@
# Jetbot

-Jetbot is a financial report analysis platform that turns PDF filings into structured financial statements, key notes, risk signals, event-study outputs, and trader-style summaries. It combines PDF extraction, validation, LLM orchestration, a FastAPI backend, and a Vue dashboard in one repository.
+Jetbot is a Filing-to-Model Copilot and Financial Fact Platform for evidence-backed financial report extraction. It turns PDF filings into canonical financial facts, structured statements, key notes, risk signals, event-study outputs, and analyst-ready summaries.

-It is designed for teams that need a single workflow to ingest reports, inspect extracted evidence, and ship the results through an API, a CLI, or a browser UI.
+It is designed for teams that need a single workflow to ingest reports, inspect source evidence, review and correct extracted facts, and ship the results through an API, a CLI, exports, or a browser UI.

## Highlights

-- End-to-end PDF pipeline for raw text, tables, statements, notes, and report generation.
+- End-to-end PDF pipeline for raw text, tables, statements, notes, facts, and report generation.
+- Canonical financial fact layer with page/table/cell evidence metadata for review and downstream exports.
+- Evaluation runner with machine-readable reports and configurable quality thresholds.
- Works in mock mode out of the box, with optional OpenAI and Anthropic model routing.
- Vue 3 dashboard for reviewing original PDFs alongside extraction and analysis outputs.
- Docker-first local stack with API, worker, Redis, PostgreSQL, and MinIO.
@@ -17,13 +19,13 @@ It is designed for teams that need a single workflow to ingest reports, inspect

```mermaid
flowchart LR
-A[Financial PDF] --> B[PDF extraction and OCR]
-B --> C[Normalization and validation]
-C --> D[LLM enrichment and report generation]
-C --> E[Risk signals and event study]
-D --> F[FastAPI and CLI]
-E --> F
-F --> G[Vue dashboard at /ui]
+A[Financial Filing PDF] --> B[PDF extraction and OCR]
+B --> C[Statements and canonical facts]
+C --> D[Evidence and validation]
+D --> E[Review, API, and exports]
+D --> F[Risk signals and analyst reports]
+E --> G[Vue dashboard at /ui]
+F --> G
```

## Quick Start
@@ -81,7 +83,7 @@ After startup, the main entry points are:
| Surface | URL / Command | Notes |
| --- | --- | --- |
| Web UI | `http://127.0.0.1:18000/ui/` | Review uploaded PDFs, tables, statements, signals, and generated reports |
-| API | `http://127.0.0.1:18000/v1` | Programmatic ingestion and retrieval |
+| API | `http://127.0.0.1:18000/v1` | Programmatic ingestion and retrieval, including canonical facts |
| OpenAPI docs | `http://127.0.0.1:18000/docs` | Interactive API explorer |
| Health | `http://127.0.0.1:18000/health` | Liveness probe |
| Metrics | `http://127.0.0.1:18000/metrics` | Prometheus endpoint |
@@ -153,6 +155,7 @@ pip install -e ".[all]"
```bash
make test
make eval
+python scripts/eval.py --thresholds benchmarks/thresholds/golden_minimum.json
make fmt
make lint
make typecheck
@@ -164,12 +167,19 @@ The repository is organized around a small number of clear surfaces:

- `src/api/` for HTTP entry points and application wiring
- `src/pdf/` for extraction, rendering, tables, and OCR
-- `src/finance/` for schemas, normalization, validation, and signal logic
+- `src/finance/` for facts, normalization, validation, and signal logic
- `src/agent/` for pipeline orchestration and state handling
- `src/market/` for event-study analysis and market providers
- `web/` for the Vue 3 dashboard
- `tests/` for API, storage, pipeline, frontend-adjacent, and integration coverage
-- `docs/` for architecture, branch protection, and project notes
+- `benchmarks/` for benchmark manifest schemas, threshold configs, and non-sensitive sample manifests
+- `docs/` for architecture, branch protection, roadmap, and project notes

+## Benchmark Data Policy
+
+Benchmark manifests, anonymized labels, synthetic fixtures, schemas, and threshold configs can be committed. Raw third-party or proprietary PDFs, private labels, customer files, and generated benchmark artifacts must stay out of git.
+
+Use `benchmarks/raw/` or `benchmarks/private/` for local-only datasets. Those paths are ignored by git. Store only stable metadata, expected facts, expected evidence pointers, and licensing notes in committed manifests.

## Contributing

27 changes: 27 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,27 @@
# Benchmark Manifests

This directory stores committed benchmark metadata for Jetbot evaluation. It is for manifests, schemas, anonymized labels, synthetic fixtures, and quality threshold configs only.

Do commit:

- `manifest.schema.json`
- anonymized benchmark manifests
- synthetic fixture metadata
- expected facts, expected evidence pointers, expected note/risk labels
- threshold configs under `thresholds/`

Do not commit:

- raw third-party or proprietary PDFs
- private customer reports
- non-anonymized analyst labels
- generated eval outputs
- files under `benchmarks/raw/` or `benchmarks/private/`

Run the current golden evaluation gate with:

```bash
python scripts/eval.py --thresholds benchmarks/thresholds/golden_minimum.json
```

Real PDF benchmark manifests should point to local-only files through relative paths such as `raw/company-2025-10k.pdf`. Those raw files are intentionally ignored by git.
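
The commit and do-not-commit lists above could be spot-checked before staging files. The sketch below is illustrative only: `is_local_only` is a hypothetical helper, not part of Jetbot, and it approximates the benchmark ignore rules from this change (`benchmarks/raw/`, `benchmarks/private/`, `*.pdf`, `*.xlsx`, `*.xls`, and `labels.private.json` anywhere under `benchmarks/`) rather than reimplementing gitignore matching.

```python
from pathlib import PurePosixPath

# Assumed mirror of the benchmark rules added to .gitignore in this change.
LOCAL_ONLY_DIRS = {("benchmarks", "raw"), ("benchmarks", "private")}
LOCAL_ONLY_SUFFIXES = {".pdf", ".xlsx", ".xls"}
LOCAL_ONLY_NAMES = {"labels.private.json"}


def is_local_only(path: str) -> bool:
    """Return True if a repo-relative POSIX path should stay out of git."""
    parts = PurePosixPath(path).parts
    if parts[:1] != ("benchmarks",):
        return False  # this sketch only covers the benchmarks/ tree
    if parts[:2] in LOCAL_ONLY_DIRS:
        return True  # anything under raw/ or private/
    name = PurePosixPath(path)
    return name.suffix in LOCAL_ONLY_SUFFIXES or name.name in LOCAL_ONLY_NAMES


if __name__ == "__main__":
    for candidate in (
        "benchmarks/raw/company-2025-10k.pdf",
        "benchmarks/sample_manifest.json",
    ):
        print(candidate, is_local_only(candidate))
```

`git check-ignore -v <path>` remains the authoritative way to see which ignore rule, if any, applies to a path.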
96 changes: 96 additions & 0 deletions benchmarks/manifest.schema.json
@@ -0,0 +1,96 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://github.com/magic-alt/jetbot/benchmarks/manifest.schema.json",
  "title": "Jetbot Benchmark Manifest",
  "type": "object",
  "additionalProperties": false,
  "required": ["schema_version", "benchmark_id", "name", "cases"],
  "properties": {
    "schema_version": {"type": "integer", "const": 1},
    "benchmark_id": {"type": "string", "minLength": 1},
    "name": {"type": "string", "minLength": 1},
    "description": {"type": "string"},
    "data_policy": {
      "type": "object",
      "additionalProperties": false,
      "required": ["raw_files_committed", "label_policy"],
      "properties": {
        "raw_files_committed": {"type": "boolean", "const": false},
        "label_policy": {"type": "string", "enum": ["synthetic", "anonymized", "private"]},
        "notes": {"type": "string"}
      }
    },
    "cases": {
      "type": "array",
      "minItems": 1,
      "items": {"$ref": "#/$defs/case"}
    }
  },
  "$defs": {
    "case": {
      "type": "object",
      "additionalProperties": false,
      "required": ["case_id", "source", "expected_facts"],
      "properties": {
        "case_id": {"type": "string", "minLength": 1},
        "company": {"type": "string"},
        "ticker": {"type": "string"},
        "filing_type": {"type": "string"},
        "period_end": {"type": "string", "format": "date"},
        "source": {
          "type": "object",
          "additionalProperties": false,
          "required": ["type", "path"],
          "properties": {
            "type": {"type": "string", "enum": ["synthetic", "pdf", "html", "xbrl"]},
            "path": {"type": "string", "minLength": 1},
            "license": {"type": "string"},
            "sha256": {"type": "string"}
          }
        },
        "expected_facts": {
          "type": "array",
          "items": {"$ref": "#/$defs/fact"}
        },
        "expected_notes": {
          "type": "array",
          "items": {"type": "string"}
        },
        "expected_risk_categories": {
          "type": "array",
          "items": {"type": "string"}
        }
      }
    },
    "fact": {
      "type": "object",
      "additionalProperties": false,
      "required": ["statement_type", "concept", "value"],
      "properties": {
        "statement_type": {"type": "string", "enum": ["income", "balance", "cashflow", "note", "other"]},
        "concept": {"type": "string", "minLength": 1},
        "label": {"type": "string"},
        "value": {"type": "number"},
        "unit": {"type": "string"},
        "currency": {"type": "string"},
        "period_end": {"type": "string", "format": "date"},
        "evidence": {
          "type": "array",
          "items": {"$ref": "#/$defs/evidence"}
        }
      }
    },
    "evidence": {
      "type": "object",
      "additionalProperties": false,
      "required": ["page"],
      "properties": {
        "page": {"type": "integer", "minimum": 1},
        "table_id": {"type": "string"},
        "row": {"type": "integer", "minimum": 0},
        "col": {"type": "integer", "minimum": 0},
        "quote": {"type": "string"}
      }
    }
  }
}
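
The `required` lists in the schema above can be illustrated with a small standard-library check. This is a sketch under stated assumptions, not a JSON Schema validator: `find_missing_fields` and `REQUIRED` are hypothetical names, and the check covers only required keys. It ignores `enum`, `const`, `additionalProperties`, `minItems`, and type constraints, which a draft 2020-12 validator would enforce.

```python
from __future__ import annotations

from typing import Any

# Required keys copied from the schema's "required" arrays.
REQUIRED = {
    "manifest": ("schema_version", "benchmark_id", "name", "cases"),
    "case": ("case_id", "source", "expected_facts"),
    "source": ("type", "path"),
    "fact": ("statement_type", "concept", "value"),
    "evidence": ("page",),
}


def _missing(obj: dict[str, Any], kind: str, where: str) -> list[str]:
    """List the required keys of `kind` that are absent from `obj`."""
    return [f"{where}: missing '{key}'" for key in REQUIRED[kind] if key not in obj]


def find_missing_fields(manifest: dict[str, Any]) -> list[str]:
    """Walk manifest -> cases -> facts -> evidence and collect missing keys."""
    problems = _missing(manifest, "manifest", "manifest")
    for i, case in enumerate(manifest.get("cases", [])):
        where = f"cases[{i}]"
        problems += _missing(case, "case", where)
        if "source" in case:
            problems += _missing(case["source"], "source", f"{where}.source")
        for j, fact in enumerate(case.get("expected_facts", [])):
            fact_where = f"{where}.expected_facts[{j}]"
            problems += _missing(fact, "fact", fact_where)
            for k, item in enumerate(fact.get("evidence", [])):
                problems += _missing(item, "evidence", f"{fact_where}.evidence[{k}]")
    return problems
```

An empty list means every required key is present; any other result names the path of each missing key.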
41 changes: 41 additions & 0 deletions benchmarks/sample_manifest.json
@@ -0,0 +1,41 @@
{
  "schema_version": 1,
  "benchmark_id": "synthetic-smoke-v1",
  "name": "Synthetic Smoke Benchmark",
  "description": "A committed example manifest showing the expected shape for benchmark metadata. It does not reference real proprietary files.",
  "data_policy": {
    "raw_files_committed": false,
    "label_policy": "synthetic",
    "notes": "Use synthetic or anonymized labels in git. Keep real PDFs under ignored local paths."
  },
  "cases": [
    {
      "case_id": "synthetic-income-001",
      "company": "Example Co",
      "ticker": "EXM",
      "filing_type": "10-Q",
      "period_end": "2025-12-31",
      "source": {
        "type": "synthetic",
        "path": "tests/golden/conftest.py",
        "license": "synthetic"
      },
      "expected_facts": [
        {
          "statement_type": "income",
          "concept": "revenue",
          "label": "Revenue",
          "value": 100.0,
          "unit": "USD millions",
          "currency": "USD",
          "period_end": "2025-12-31",
          "evidence": [
            {"page": 1, "quote": "Revenue 100"}
          ]
        }
      ],
      "expected_notes": ["other"],
      "expected_risk_categories": []
    }
  ]
}
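
The schema allows an optional `sha256` on each case's `source`, which the sample manifest omits. The diff does not say how that value is produced; the sketch below assumes it holds the hex SHA-256 digest of the file's bytes. `sha256_of_file` is a hypothetical helper, not part of Jetbot.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large PDFs are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the digest in a committed manifest would let a reviewer confirm that a local-only file matches the one the labels were written against, without the file itself entering git.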
12 changes: 12 additions & 0 deletions benchmarks/thresholds/golden_minimum.json
@@ -0,0 +1,12 @@
{
  "schema_version": 1,
  "description": "Initial non-regression thresholds for the synthetic golden suite. Tighten these as extraction quality improves.",
  "min_metrics": {
    "n_cases": 5,
    "avg_source_ref_completeness": 1.0,
    "avg_signal_category_recall": 0.8,
    "avg_note_type_recall": 0.6,
    "avg_fact_value_accuracy": 0.08,
    "avg_fact_source_ref_completeness": 0.34
  }
}
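
A gate over `min_metrics` can be sketched as a comparison of each aggregate metric against its minimum, with a non-zero exit code on any shortfall. The implementation of `scripts/eval.py` is not shown in this diff, so everything below is an assumption: `failed_thresholds` and `gate_exit_code` are hypothetical names, and treating a missing metric as a failure is a design choice of this sketch.

```python
from __future__ import annotations

import json


def failed_thresholds(metrics: dict[str, float], min_metrics: dict[str, float]) -> list[str]:
    """Return one message per metric that is missing or below its minimum."""
    failures = []
    for name, minimum in min_metrics.items():
        actual = metrics.get(name)
        if actual is None or actual < minimum:
            failures.append(f"{name}: got {actual}, need >= {minimum}")
    return failures


def gate_exit_code(metrics: dict[str, float], thresholds_json: str) -> int:
    """Parse a thresholds document and return 0 on pass, 1 on any failure."""
    config = json.loads(thresholds_json)
    failures = failed_thresholds(metrics, config["min_metrics"])
    for line in failures:
        print(line)
    return 1 if failures else 0
```

A caller would pass the return value to `sys.exit` so that CI marks the job as failed when a metric regresses.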
22 changes: 12 additions & 10 deletions docs/financial_fact_platform_roadmap.md
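@@ -79,7 +79,7 @@ Jetbot already has a fairly complete financial-report PDF agent MVP: PDF upload,

## 4. Completed First Implementation Slice

-The first slice of this roadmap has been implemented on the current branch `feat/financial-fact-foundation`. Its goal is to lay the fact-layer foundation for later human review, export, and benchmarking.
+The first slice of this roadmap has been merged into `main` via PR12. Its goal is to lay the fact-layer foundation for later human review, export, and benchmarking.

### 4.1 Schema and Evidence Model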

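@@ -162,7 +162,7 @@ Jetbot already has a fairly complete financial-report PDF agent MVP: PDF upload,

- The docs and README clearly state Jetbot's positioning for the next stage.
- Every P0 feature maps to a quality metric.
-- Real, sensitive PDFs are not committed to the repository.
+- Real, sensitive PDFs are not committed to the repository; real samples stay in local or private storage, and the repository holds only manifests, anonymized labels, synthetic fixtures, schemas, and threshold configs.

### Phase 1: Benchmark and Eval CI, Weeks 1-2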

@@ -192,6 +192,7 @@ Jetbot already has a fairly complete financial-report PDF agent MVP: PDF upload,
Acceptance criteria:

- `python scripts/eval.py --output-dir data/eval-dev` generates a report.
+- `python scripts/eval.py --thresholds benchmarks/thresholds/golden_minimum.json` can serve as a quality gate, returning a non-zero exit code when a metric falls below its threshold.
- The report contains document-level and aggregate metrics.
- The synthetic golden gate runs reliably in CI.
- The real PDF benchmark runs locally without committing sensitive samples to git.
@@ -574,13 +575,14 @@ docker compose up --build

## 10. Recommended Next Execution Order

-1. Complete the correction API and effective facts.
-2. Add a facts tab or review panel to the frontend.
-3. Add a bbox overlay to `PdfViewer`.
-4. Add a row/col/bbox payload to `EvidenceLink`.
-5. Add Excel/CSV/JSON export.
-6. Extend the benchmark manifest and threshold gate.
-7. Start the table router protocol.
-8. Then add SEC/XBRL/HTML ingestion.
+1. Close out Phase 0: formally position the README and roadmap as Filing-to-Model Copilot / Financial Fact Platform, and document the benchmark data policy.
+2. Complete the Phase 1 evaluation gate: benchmark manifest schema, sample manifest, threshold config, and eval gate.
+3. Complete the correction API and effective facts.
+4. Add a facts tab or review panel to the frontend.
+5. Add a bbox overlay to `PdfViewer`.
+6. Add a row/col/bbox payload to `EvidenceLink`.
+7. Add Excel/CSV/JSON export.
+8. Start the table router protocol.
+9. Then add SEC/XBRL/HTML ingestion.

The test for this roadmap is simple: every added capability must make facts more accurate, evidence more auditable, review faster, and output better suited to real analyst workflows.