Skip to content

Add anonymous usage telemetry collection#66

Merged
alexkroman merged 4 commits into
mainfrom
claude/sleepy-allen-yuemuz
Jun 11, 2026
Merged

Add anonymous usage telemetry collection#66
alexkroman merged 4 commits into
mainfrom
claude/sleepy-allen-yuemuz

Conversation

@alexkroman

Copy link
Copy Markdown
Collaborator

Adds opt-out anonymous usage telemetry to track command invocations, outcomes, and performance metrics. Telemetry is collected via a detached background process using a Datadog write-only client token, ensuring it never blocks or breaks commands.

Key Changes

  • Core telemetry module (aai_cli/telemetry.py): Implements event collection, consent management, and background flushing

    • Collects command name, outcome, exit code, duration, CLI version, OS, Python version, CI detection, and a random device ID
    • Never collects arguments, file paths, transcripts, API keys, or account data
    • Delivery runs in a detached subprocess with stdio discarded, so failures are swallowed and never impact the command
    • Opt-out via AAI_TELEMETRY_DISABLED=1, DO_NOT_TRACK=1, or aai telemetry disable
    • Uses Datadog's browser-intake API with a public client token (pub…) designed for embedding
  • Telemetry command (aai_cli/commands/telemetry.py): User-facing consent surface

    • aai telemetry status: Show whether telemetry is active and why
    • aai telemetry enable/disable: Persistently opt in or out
    • Environment kill-switches always win over persisted choice
  • Config persistence (aai_cli/config.py):

    • device_id: Random UUID minted locally on first use, persisted in config.toml
    • telemetry_enabled: Persisted opt-out choice (None = never chosen, treated as enabled)
  • Command integration (aai_cli/context.py):

    • track() context manager wraps command execution to capture outcome and duration
    • Maps CLIError types to outcome strings, handles typer.Exit codes, catches unexpected exceptions as internal_error
    • Integrates with run_command() to track all CLI invocations
  • Comprehensive test coverage (tests/test_telemetry.py, tests/test_telemetry_command.py):

    • Token/URL resolution, consent logic, event shape validation
    • Detached flusher subprocess spawning and payload posting
    • Command tracking with error classification and duration measurement
    • Integration tests for aai telemetry commands
  • Documentation: Updated README with telemetry overview and opt-out instructions

Implementation Details

  • Telemetry is disabled by default in tests via neutralize_shipped_token fixture that blanks the shipped token suite-wide, allowing tests to opt in explicitly
  • Event dispatch uses subprocess.Popen with start_new_session=True and stdio discarded to ensure the flusher never blocks the command
  • Payload travels via argv as JSON (acceptable since it only contains the public client token and event data)
  • Consent model is opt-out: consent_granted() returns True unless env kill-switches or persisted False choice exist
  • is_enabled() requires both a token and consent to be true

https://claude.ai/code/session_017kPHAMnWAUs6XHVGk4FyWE

Model after the Supabase CLI: ship a write-only Datadog client token
(pub…, embeddable by design — deliberately committed and allowlisted in
gitleaks; never an API key) and send one allow-listed event per command
run to the Datadog logs intake.

- aai_cli/telemetry.py: consent (AAI_TELEMETRY_DISABLED / DO_NOT_TRACK /
  persisted choice), event build (command path, outcome class, exit code,
  duration, version/OS/CI — never args, paths, or account data), and a
  detached flusher subprocess so commands never wait on telemetry; all
  send-side failures are swallowed.
- context.run_command wraps each command body in telemetry.track(), so
  CLIError outcomes keep their machine-readable error_type.
- config.py: random anonymous device_id + telemetry_enabled persisted in
  config.toml.
- aai telemetry status/enable/disable: the user-facing consent surface,
  documented in README ("Telemetry") with the full opt-out story.
- Test suite blanks the shipped token via an autouse fixture so no test
  spawns a real flusher; telemetry paths are covered by dedicated unit,
  command, and run_command-integration tests.

https://claude.ai/code/session_017kPHAMnWAUs6XHVGk4FyWE
Comment thread aai_cli/telemetry.py
# `phc_` keys ship in open-source CLIs. An API key (account secret) must never
# appear here. Rotate in Datadog (Organization Settings → Client Tokens) if abused;
# AAI_TELEMETRY_CLIENT_TOKEN overrides without a release.
SHIPPED_CLIENT_TOKEN = "pub0d633113b9f7d22faff215fefaf30b43"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposed secret in aai_cli/telemetry.py - high severity
Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

Reply @AikidoSec ignore: [REASON] to ignore this issue.
More Info

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AikidoSec ignore: This is a Datadog client token (pub…) — the write-only credential class Datadog designs to be embedded in shipped client apps. It can submit telemetry events but cannot read any data, so it is not an account secret. It is committed deliberately (the same model as PostHog phc_ keys in open-source CLIs, e.g. the Supabase CLI) and is allowlisted by exact value in .gitleaks.toml; any other real-looking token still fails the secret-scan gate.


Generated by Claude Code

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Based on your feedback, we ignored this issue because of the following reason:

This is a Datadog client token (pub…) — the write-only credential class Datadog designs to be embedded in shipped client apps. It can submit telemetry events but cannot read any data, so it is not an account secret. It is committed deliberately (the same model as PostHog phc_ keys in open-source CLIs, e.g. the Supabase CLI) and is allowlisted by exact value in .gitleaks.toml; any other real-looking token still fails the secret-scan gate.


Generated by Claude Code

Comment thread tests/test_telemetry.py
# The committed credential must be a Datadog *client* token (pub…, write-only,
# embeddable by design) — never an API key. The autouse fixture blanks it for
# the suite and hands back the real value for exactly this assertion.
assert neutralize_shipped_token == "pub0d633113b9f7d22faff215fefaf30b43"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposed secret in tests/test_telemetry.py - low severity
Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

Reply @AikidoSec ignore: [REASON] to ignore this issue.
More Info

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AikidoSec ignore: Same value as the finding in aai_cli/telemetry.py — a write-only Datadog client token (pub…), committed deliberately and allowlisted by exact value in .gitleaks.toml. The test asserts the shipped constant is exactly this public client token (and never an API-key shape).


Generated by Claude Code

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Based on your feedback, we ignored this issue because of the following reason:

Same value as the finding in aai_cli/telemetry.py — a write-only Datadog client token (pub…), committed deliberately and allowlisted by exact value in .gitleaks.toml. The test asserts the shipped constant is exactly this public client token (and never an API-key shape).


Generated by Claude Code

Comment thread aai_cli/telemetry.py Outdated
# network. The payload travels via argv (the Vercel CLI hands its flusher the
# serialized request the same way); it carries only the event + the write-only
# public token, so argv visibility is acceptable.
_FLUSH_SNIPPET = "from aai_cli import telemetry; telemetry.flush_argv()"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spawning a detached Python subprocess via a constructed -c snippet and discarding stdio hides runtime behavior; consider avoiding dynamic -c execution or make the subprocess invocation explicit and visible to reviewers.

Details

✨ AI Reasoning
​The change introduces a detached flusher subprocess that runs Python code via the command-line -c snippet and discards stdio. The payload is passed as an argv string and then parsed in the child. Using subprocess.Popen with start_new_session=True and redirecting stdout/stderr to DEVNULL hides the child process's activity from the user and logs. Executing a dynamically-constructed snippet with -c increases the difficulty of static review and can conceal side effects. These patterns (dynamic -c execution, payload via argv, suppressed stdio, detached session) were not present before and were added by this diff. This raises a transparency/obfuscation concern under the rule: intent of that runtime behavior is hidden from observers and reviewers.

🔧 How do I fix it?
Ensure code is transparent and not intentionally obfuscated. Avoid hiding functionality from code review. Focus on intent and deception, not specific patterns.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

Comment thread aai_cli/telemetry.py
def dispatch(event: Mapping[str, object]) -> None:
"""Hand one event to a detached flusher process and return immediately."""
payload = json.dumps({"url": intake_url(), "token": client_token(), "event": event})
subprocess.Popen(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dispatch uses subprocess.Popen to run a hidden flusher (-c snippet) with the payload passed via argv and stdio discarded; this conceals execution and data flow from reviewers.

Details

✨ AI Reasoning
​dispatch() now serializes event+token into JSON and passes it to a detached child via process argv (subprocess.Popen([... , '-c', _FLUSH_SNIPPET, payload], ...)). Passing sensitive or operational data via dynamically passed argv to an -c snippet and running it with stdio suppressed obscures what is being sent and executed. While the payload here is intended to be public telemetry, this pattern was added by the PR and introduces an obfuscation-like execution path that makes review and auditing harder.

🔧 How do I fix it?
Ensure code is transparent and not intentionally obfuscated. Avoid hiding functionality from code review. Focus on intent and deception, not specific patterns.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

@alexkroman alexkroman enabled auto-merge (squash) June 11, 2026 17:31
Comment thread aai_cli/telemetry.py Outdated
)


def flush_argv() -> None:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flush_argv() reads and executes serialized payloads from sys.argv[1], invoked by a dynamically-executed -c snippet; this runtime indirection conceals the request flow and complicates auditability.

Details

✨ AI Reasoning
​flush_argv() reads sys.argv[1] (the payload) and then loads and posts it. This function is intended to be called by the dynamically-executed -c snippet in the detached child. That dynamic invocation pattern (assembling a snippet referencing module functions and passing a serialized payload through argv) is a deliberate runtime indirection added by this diff, which obscures the actual network request flow from casual code review.

🔧 How do I fix it?
Ensure code is transparent and not intentionally obfuscated. Avoid hiding functionality from code review. Focus on intent and deception, not specific patterns.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

claude added 3 commits June 11, 2026 17:41
_fake_ytdlp replaces sys.modules["yt_dlp"] with a non-package namespace,
so the --download-sections path's `from yt_dlp.utils import …` only
resolved when an earlier test had already imported the real module —
passing or failing with pytest-randomly's seed. Cache the real
yt_dlp.utils first so the import is deterministic.

https://claude.ai/code/session_017kPHAMnWAUs6XHVGk4FyWE
Address the Aikido review findings on the dispatch path: spawning a
dynamically-constructed `python -c` snippet is needlessly opaque. The
detached flusher is now the CLI's own hidden `telemetry flush` command
(the same shape the Vercel CLI uses) — an explicit, reviewable entry
point. The child env sets AAI_TELEMETRY_DISABLED=1 so a flush can never
spawn another flusher, and a test pins that `flush` stays hidden from
help.

https://claude.ai/code/session_017kPHAMnWAUs6XHVGk4FyWE
Resolves the tests/test_youtube.py conflict by taking main's version of
the flake fix (both sides fixed the same order-dependent yt_dlp.utils
import identically, differing only in the comment), keeps the telemetry
bullet in AGENTS.md, and follows the CLI rename through the telemetry
surface: command examples, the status hint, README disclosure, and the
command-path assertions (events now record "assembly telemetry status").
Internals are unchanged — the package is still aai_cli, so the detached
flusher's `python -m aai_cli telemetry flush` spawn and the AAI_*
telemetry env vars keep working.

https://claude.ai/code/session_017kPHAMnWAUs6XHVGk4FyWE
@alexkroman alexkroman merged commit 461862f into main Jun 11, 2026
11 checks passed
@alexkroman alexkroman deleted the claude/sleepy-allen-yuemuz branch June 11, 2026 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants