
Conversation

@valuecodes
Owner

What

Introduce functionality for tracking and reporting AI usage for Claude and Codex models. This adds a new CLI command, `pnpm ai:usage`, that summarizes token usage and estimated costs from local logs and a pricing configuration. The implementation is built around a usage pipeline that handles data collection, log reading, and output formatting, and it extends the existing CLI with new commands and improved user documentation.

  • Implemented log readers for Claude and Codex usage data.

  • Added dynamic column width calculation for output formatting.

  • Created a usage pipeline to orchestrate log reading and report generation.

  • Enhanced token aggregation logic to include cache read/write tokens.

  • Updated README with usage instructions and examples for the new CLI command.
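The pieces listed above can be sketched roughly as follows. This is an illustrative assumption, not the PR's actual API: the record shape and function name are invented for the sketch, and only the described behavior (aggregating per-model totals that include cache read/write tokens) is taken from the bullets.

```typescript
// Illustrative sketch of the described flow: read usage records, then
// aggregate total tokens per model, including cache read/write tokens.
// All names here are assumptions; the real code lives under src/cli/ai-usage.
interface UsageRecord {
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

const aggregateTokens = (records: UsageRecord[]): Map<string, number> => {
  const totals = new Map<string, number>();
  for (const r of records) {
    const total =
      r.inputTokens + r.outputTokens + r.cacheReadTokens + r.cacheWriteTokens;
    totals.set(r.model, (totals.get(r.model) ?? 0) + total);
  }
  return totals;
};
```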

How to test

  • Run pnpm ai:usage to summarize AI usage for the current repository.

  • Use flags like --since to specify time windows or --repo to target specific repositories.

  • Validate the output format and ensure it matches the expected summary and table structure.

Security review

  • Secrets / env vars: not changed.

  • Auth / session: not changed.

  • Network / API calls: not changed.

  • Data handling / PII: not changed.

  • Dependencies: not changed.

No security-impacting changes identified.

  • No new dependencies and no network calls.

  • No env var changes and no auth/session logic touched.

- Implement Claude and Codex log readers for usage data
- Create output formatter for displaying aggregated usage
- Add pricing configuration and usage aggregation logic
- Introduce CLI commands for usage reporting
- Introduce `pnpm ai:usage` command for summarizing token usage
- Update total token calculation to include cache read/write tokens
- Enhance documentation for AI usage features
- Add UsagePipeline class to orchestrate log reading and formatting
- Refactor main CLI logic to utilize the new pipeline for report generation
- Enhance output formatting methods for better string handling
Copilot AI review requested due to automatic review settings January 29, 2026 14:13

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 03e64c3d76


Comment on lines 220 to 225
```ts
// Dedupe: Codex emits duplicate token_count events with identical values
const usageKey = `${usage.input_tokens}:${usage.output_tokens}:${usage.cached_input_tokens}`;
if (usageKey === lastUsageKey) {
  continue;
}
lastUsageKey = usageKey;
```


P2: Avoid dropping distinct Codex calls with same token counts

The dedupe logic treats any consecutive token_count events with identical (input, output, cached) values as duplicates. In real usage, two different Codex calls can legitimately have the same token counts (e.g., repeated prompts or short completions), so the second call will be skipped and its tokens/costs undercounted. This will misstate usage in those scenarios; consider deduping on a stronger signal (timestamp/event id) or only skipping true duplicate records.

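A stronger key that includes the event timestamp would only drop true re-emissions of the same record, as the reviewer suggests. The sketch below is a hedged illustration: the event shape and function name are assumptions, not the PR's actual code.

```typescript
// Hypothetical sketch: dedupe consecutive token_count events on
// timestamp + token counts, so two distinct calls that happen to have
// identical counts are not collapsed. Field names mirror the snippet
// above, but the event shape is an assumption.
interface TokenCountEvent {
  timestamp: string;
  input_tokens: number;
  output_tokens: number;
  cached_input_tokens: number;
}

const dedupeTokenEvents = (events: TokenCountEvent[]): TokenCountEvent[] => {
  let lastKey = "";
  const kept: TokenCountEvent[] = [];
  for (const e of events) {
    const key = `${e.timestamp}:${e.input_tokens}:${e.output_tokens}:${e.cached_input_tokens}`;
    if (key === lastKey) {
      continue; // true duplicate: same event re-emitted back-to-back
    }
    lastKey = key;
    kept.push(e);
  }
  return kept;
};
```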

Contributor

Copilot AI left a comment


Pull request overview

This PR introduces a new CLI command pnpm ai:usage for tracking and reporting AI token usage from Claude and Codex models. The implementation reads local log files, aggregates token usage by provider and model, calculates costs based on a pricing configuration, and outputs formatted summaries with tables.

Changes:

  • Added complete AI usage tracking CLI with log readers for Claude and Codex
  • Implemented usage aggregation, cost calculation, and flexible output formatting (text/JSON)
  • Enhanced main README and package.json with new command documentation
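The cost calculation described above (rates per 1M tokens applied to aggregated usage) can be sketched as follows. The field names follow the pricing file quoted later in this review; the function itself is an assumption, not the PR's actual implementation.

```typescript
// Hedged sketch of the per-1M-token cost math. Pricing fields mirror
// ai-usage.pricing.json; the costUsd helper is illustrative only.
interface ModelPricing {
  input: number;      // USD per 1M input tokens
  output: number;     // USD per 1M output tokens
  cacheRead: number;  // USD per 1M cache-read tokens
  cacheWrite: number; // USD per 1M cache-write tokens
}

interface TokenCounts {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

const costUsd = (p: ModelPricing, t: TokenCounts): number =>
  (t.input * p.input +
    t.output * p.output +
    t.cacheRead * p.cacheRead +
    t.cacheWrite * p.cacheWrite) /
  1_000_000;
```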

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.

Show a summary per file

  • src/cli/ai-usage/types/schemas.ts: Defines Zod schemas for CLI args, pricing config, log entries, and aggregated usage types
  • src/cli/ai-usage/main.ts: Entry point that parses arguments and executes the usage pipeline
  • src/cli/ai-usage/constants.ts: Default log paths and duration mappings for time window parsing
  • src/cli/ai-usage/clients/usage-pipeline.ts: Orchestrates log reading, aggregation, and formatting; resolves repo paths and loads pricing
  • src/cli/ai-usage/clients/usage-aggregator.ts: Groups usage records by provider/model and calculates costs based on the pricing config
  • src/cli/ai-usage/clients/output-formatter.ts: Formats output as summary text, markdown tables, or JSON with dynamic column widths
  • src/cli/ai-usage/clients/codex-log-reader.ts: Parses Codex JSONL logs from date-organized directories, extracts token usage with deduplication
  • src/cli/ai-usage/clients/claude-log-reader.ts: Parses Claude JSONL logs from encoded repo directories, filters by timestamp and cwd
  • src/cli/ai-usage/ai-usage.pricing.json: Model pricing configuration with rates per 1M tokens, including cache read/write costs
  • src/cli/ai-usage/README.md: Comprehensive documentation with usage examples, internals, and sample output
  • package.json: Added ai:usage npm script to run the new CLI
  • README.md: Updated main README with AI usage CLI documentation and examples
  • AGENTS.md: Added reference to pnpm ai:usage command in common scripts section


Comment on lines 8 to 26
"cacheWrite": 10.0,
"cacheWrite_5m": 6.25,
"cacheWrite_1h": 10.0
},
"claude-sonnet-4": {
"input": 3.0,
"output": 15.0,
"cacheRead": 0.3,
"cacheWrite": 6.0,
"cacheWrite_5m": 3.75,
"cacheWrite_1h": 6.0
},
"claude-haiku-4-5": {
"input": 1.0,
"output": 5.0,
"cacheRead": 0.1,
"cacheWrite": 2.0,
"cacheWrite_5m": 1.25,
"cacheWrite_1h": 2.0
Copy link

Copilot AI Jan 29, 2026


The pricing configuration includes fields like cacheWrite_5m and cacheWrite_1h that are not defined in the ModelPricingSchema and are not used by the aggregator. These extra fields should either be removed or the schema should be updated to include them if they represent future planned functionality. Currently they are silently ignored which could lead to confusion.

Suggested change (drop the unused cacheWrite_5m / cacheWrite_1h fields):

```json
"cacheWrite": 10.0
},
"claude-sonnet-4": {
  "input": 3.0,
  "output": 15.0,
  "cacheRead": 0.3,
  "cacheWrite": 6.0
},
"claude-haiku-4-5": {
  "input": 1.0,
  "output": 5.0,
  "cacheRead": 0.1,
  "cacheWrite": 2.0
```
```ts
}

// Check if this date is within range
const dirDate = new Date(`${year}-${month}-${day}`);
```

Copilot AI Jan 29, 2026


The date construction from year/month/day directory names could fail or produce incorrect dates if the directory names are not zero-padded (e.g., "2024/1/5" instead of "2024/01/05"). Consider using explicit parsing with validation, or documenting the expected format. This could lead to dates being silently skipped or incorrectly compared.

Suggested change (replacing the single-line `new Date(...)` construction):

```ts
const yearNum = Number.parseInt(year, 10);
const monthNum = Number.parseInt(month, 10);
const dayNum = Number.parseInt(day, 10);
if (
  Number.isNaN(yearNum) ||
  Number.isNaN(monthNum) ||
  Number.isNaN(dayNum) ||
  monthNum < 1 ||
  monthNum > 12 ||
  dayNum < 1 ||
  dayNum > 31
) {
  // Skip directories that do not represent a valid date
  continue;
}
const dirDate = new Date(yearNum, monthNum - 1, dayNum);
```

```ts
 * /home/juha/code/foo -> -home-juha-code-foo
 */
const encodeRepoPath = (repoPath: string): string => {
  return repoPath.replace(/\//g, "-");
```

Copilot AI Jan 29, 2026


The encodeRepoPath function replaces all forward slashes with hyphens; for "/home/juha/code/foo" this yields "-home-juha-code-foo", matching the comment example. However, it could be problematic on Windows systems, where paths use backslashes. Consider documenting this limitation or adding platform-specific handling.

Suggested change

```ts
// Normalize Windows-style backslashes to forward slashes before encoding
const normalizedPath = repoPath.replace(/\\/g, "/");
return normalizedPath.replace(/\//g, "-");
```

Comment on lines 207 to 216
```ts
// Skip if cwd doesn't match repo
if (sessionData.cwd && !sessionData.cwd.startsWith(repoPath)) {
  if (this.debug) {
    this.logger.debug("Skipping Codex entry: cwd mismatch", {
      cwd: sessionData.cwd,
      repoPath,
    });
  }
  continue;
}
```

Copilot AI Jan 29, 2026


If sessionData.cwd is undefined, the cwd check is skipped and all entries would be included regardless of repository. This could lead to counting usage from other repositories. Consider either making the cwd check mandatory (skip entries without cwd) or documenting this as expected behavior if it's intentional.

Comment on lines 131 to 140
```ts
// Match repo path if cwd is present
if (entry.cwd && !entry.cwd.startsWith(repoPath)) {
  if (this.debug) {
    this.logger.debug("Skipping entry: cwd mismatch", {
      cwd: entry.cwd,
      repoPath,
    });
  }
  continue;
}
```

Copilot AI Jan 29, 2026


If entry.cwd is undefined, the cwd check is skipped and all entries would be included regardless of repository. This could lead to counting usage from other repositories. Consider either making the cwd check mandatory (skip entries without cwd) or documenting this as expected behavior if it's intentional.

Comment on lines 220 to 225
```ts
// Dedupe: Codex emits duplicate token_count events with identical values
const usageKey = `${usage.input_tokens}:${usage.output_tokens}:${usage.cached_input_tokens}`;
if (usageKey === lastUsageKey) {
  continue;
}
lastUsageKey = usageKey;
```

Copilot AI Jan 29, 2026


The deduplication logic only checks if consecutive token_count events have identical values, but doesn't consider the timestamp. This means if two different API calls within the same session happen to have exactly the same token counts, the second one would be incorrectly skipped. Consider including the timestamp in the deduplication key or adding a small time window check to make the deduplication more robust.

"run:scrape-publications": "tsx src/cli/scrape-publications/main.ts",
"run:etf-backtest": "tsx src/cli/etf-backtest/main.ts",
"run:agent-evals": "tsx src/cli/agent-evals/main.ts",
"ai:usage": "tsx src/cli/ai-usage/main.ts",

Copilot AI Jan 29, 2026


The PR description contains contradictory statements about dependencies. It first says "added. (New dependencies for logging and argument parsing.)" but then later states "No new dependencies and no network calls." Looking at the package.json, no new dependencies were actually added. Consider updating the PR description to remove the contradictory statement about new dependencies being added.

…d deduplication

- Simplify repo path encoding in ClaudeLogReader
- Add tests for CodexLogReader to validate log file handling
- Improve date validation and deduplication logic in CodexLogReader
@valuecodes valuecodes merged commit 777d7a1 into main Jan 29, 2026
4 checks passed
@valuecodes valuecodes deleted the token-counter branch January 29, 2026 14:45