
Add AI agent detection to user-agent header #88

Merged: simonfaltum merged 2 commits into main from simonfaltum/agent-detection on Apr 20, 2026

Conversation

@simonfaltum

Member

Summary

Adds detection for 15 AI coding agents (amp, antigravity, augment, claude-code, cline, codex, copilot-cli, copilot-vscode, cursor, gemini-cli, goose, kiro, openclaw, opencode, windsurf) so the SDK emits a single agent/<name> segment in its user-agent string when an agent is identified. Mirrors parallel work in the Go (databricks/databricks-sdk-go#1637), Java (databricks/databricks-sdk-java#768), and Python (databricks/databricks-sdk-py#1394) SDKs so all four SDKs ship the same canonical list and precedence rules.

Why

Databricks wants visibility into which AI coding agents are calling our APIs so that we can understand adoption, prioritize fixes for the environments our customers use, and detect compatibility issues early. The three sibling SDKs just landed this feature; the JS SDK has a smaller detection list (9 agents), emits one segment per detected agent instead of a single canonical segment, and does not honor the AGENT=<name> standard from agents.md. Without this change, traffic from JS SDK users running inside agents is invisible or reported inconsistently with the other SDKs.

The library policy in .agent/rules/libraries.mdc prefers picking a dependency over hand-rolling. We intentionally deviate here: the canonical agent list, env var names, and precedence rules are coordinated across four SDKs, and existing libraries (std-env, @vercel/detect-agent) cover different subsets of agents, apply different precedence, and would re-introduce drift the moment we add a new agent. Implementation is ~80 lines with zero dependencies and matches the Go/Java/Python implementations.

What changed

Interface changes

  • packages/core/src/clientinfo/agent.ts (new) - Exports agentProvider() (cached for the process lifetime) and lookupAgentProvider() (uncached, primarily for tests). clearAgentCache() is exported from the module file (not the barrel) for tests only, matching the pattern documented in .agent/rules/testing.mdc for intentionally-unbarreled symbols.
  • packages/core/src/clientinfo/index.ts - Adds agentProvider to the public barrel.
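
A minimal sketch of the cached/uncached split described above, not the code in agent.ts. The helper name makeCachedProvider and the lookup signature are hypothetical; only agentProvider(), lookupAgentProvider(), and clearAgentCache() are named in this PR, and their real signatures may differ.

```typescript
// Hypothetical sketch of process-lifetime caching with a test-only reset.
// The real agent.ts may be structured differently.
type Lookup = () => string | undefined;

function makeCachedProvider(lookup: Lookup) {
  // Wrapping the result lets "not computed yet" differ from "computed, no agent".
  let cached: { value: string | undefined } | undefined;
  return {
    // Plays the role of agentProvider(): runs the lookup at most once.
    get(): string | undefined {
      if (cached === undefined) {
        cached = { value: lookup() };
      }
      return cached.value;
    },
    // Plays the role of clearAgentCache(): lets tests force a fresh lookup.
    clear(): void {
      cached = undefined;
    },
  };
}

// Usage: the second get() is served from the cache.
let lookups = 0;
const provider = makeCachedProvider(() => {
  lookups += 1;
  return undefined; // "no agent detected" is cached as well
});
provider.get();
provider.get();
const lookupsBeforeClear = lookups; // 1
provider.clear();
provider.get();
const lookupsAfterClear = lookups; // 2
```

Caching the "no agent" result too means the environment is read once per process, which is why tests that mutate env vars need the reset hook.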

Behavioral changes

  • createDefault() now appends at most one agent/<name> segment instead of one per matching env var. When two explicit matchers fire simultaneously (ambiguity), no agent/ segment is emitted.
  • AGENT=<name> is now honored as a fallback. When no explicit env var matches, AGENT=<known-product> maps to that product, any other non-empty AGENT value maps to agent/unknown, and an empty or unset AGENT emits nothing.
  • Explicit env vars always win over AGENT=<name> (e.g. CLAUDECODE=1 + AGENT=goose reports claude-code).
  • Detection is cached for the process lifetime, matching Go's sync.Once, Java's volatile lazy init, and Python's _agent_provider sentinel.
  • Agent list grows from 9 to 15: adds amp, augment, copilot-vscode, goose, kiro, windsurf. Existing nine agents continue to work.
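
The precedence rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the shipped implementation: lookupAgentSketch and the two-entry matcher table are hypothetical, the table lists only the env vars this PR text names (CLAUDECODE, COPILOT_CLI), and the env is passed in as a plain object so the sketch does not depend on process.env.

```typescript
type Env = Record<string, string | undefined>;

// Canonical product names, copied from the PR summary.
const KNOWN_PRODUCTS: readonly string[] = [
  "amp", "antigravity", "augment", "claude-code", "cline", "codex",
  "copilot-cli", "copilot-vscode", "cursor", "gemini-cli", "goose",
  "kiro", "openclaw", "opencode", "windsurf",
];

// ASSUMPTION: a deliberately partial matcher table. The real module has
// explicit matchers for all 15 products; only these two env vars are named
// in this PR's text.
const MATCHERS: ReadonlyArray<{ envVar: string; product: string }> = [
  { envVar: "CLAUDECODE", product: "claude-code" },
  { envVar: "COPILOT_CLI", product: "copilot-cli" },
];

function lookupAgentSketch(env: Env): string | undefined {
  // 1. Explicit env vars. An empty string still counts as "set".
  const matched: string[] = [];
  for (const m of MATCHERS) {
    if (env[m.envVar] !== undefined && matched.indexOf(m.product) === -1) {
      matched.push(m.product);
    }
  }
  if (matched.length === 1) return matched[0];
  // 2. Two explicit matchers at once: ambiguous, emit nothing.
  if (matched.length > 1) return undefined;
  // 3. AGENT=<name> fallback, used only when no explicit matcher fired.
  const agent = env["AGENT"];
  if (agent === undefined || agent === "") return undefined;
  return KNOWN_PRODUCTS.indexOf(agent) !== -1 ? agent : "unknown";
}

// Usage: build the optional user-agent segment.
function agentSegment(env: Env): string {
  const name = lookupAgentSketch(env);
  return name === undefined ? "" : `agent/${name}`;
}

console.log(agentSegment({ CLAUDECODE: "1", AGENT: "goose" })); // agent/claude-code
console.log(agentSegment({ AGENT: "goose" })); // agent/goose
console.log(agentSegment({ AGENT: "my-bot" })); // agent/unknown
```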

Internal changes

  • The inlined KNOWN_AGENTS list and detectAgents() function in packages/core/src/clientinfo/default.ts move to the new module.
  • The existing default.test.ts test case "multiple agents all reported" is replaced by "multiple agents are ambiguous and omit the agent segment" to reflect the new ambiguity semantics. Two new cases cover the AGENT fallback path. Adds clearAgentCache() calls in beforeEach/afterEach since detection is now cached.
  • packages/core/vitest.config.browser.ts excludes tests/clientinfo/agent.test.ts for the same reason default.test.ts is excluded: agent detection reads process.env and is Node-only.

How is this tested?

  • New packages/core/tests/clientinfo/agent.test.ts mirrors the Go test cases from useragent/agent_test.go: every agent detected via its primary env var, empty-string env values counting as set, ambiguity when two explicit matchers fire, AGENT fallback for known and unknown values, explicit env vars winning over AGENT=<name>, the pinned COPILOT_CLI + COPILOT_MODEL ambiguity case for Copilot CLI BYOK users, and cache persistence after env changes.
  • npm run format:check, npm run lint, npm run typecheck, npm test, and npm run test:browser all pass. Core package runs 240 unit tests (29 new) and 150 browser tests.

Adds detection for 15 AI coding agents (amp, antigravity, augment,
claude-code, cline, codex, copilot-cli, copilot-vscode, cursor,
gemini-cli, goose, kiro, openclaw, opencode, windsurf) to the core
user-agent. The detected product name is emitted as a single
agent/<name> segment by createDefault so that Databricks can understand
which agents are invoking the SDK.

Detection honors the agents.md standard AGENT env var with an unknown
fallback, and resolves ambiguity conservatively by emitting no segment
when two explicit matchers fire at once. Explicit product env vars
always take precedence over AGENT=<name>.

Behavior matches the parallel changes in databricks-sdk-go #1637,
databricks-sdk-java #768, and databricks-sdk-py #1394.

Signed-off-by: simon <simon.faltum@databricks.com>
simonfaltum requested a review from parthban-db on April 20, 2026, 09:53
Comment thread: packages/core/src/clientinfo/agent.ts
simonfaltum requested a review from parthban-db on April 20, 2026, 10:51
Nested agents (e.g. a Cursor CLI subagent spawned by Claude Code) set
multiple agent env vars on the same process. The previous ambiguity
guard silently dropped the signal in that case. Report "multiple"
instead so the stacked case is visible in telemetry.

Also collapse the known BYOK false positive where Copilot CLI users
have COPILOT_MODEL set alongside COPILOT_CLI: that pair now reports
"copilot-cli" rather than "multiple".

Co-authored-by: Isaac
Signed-off-by: simon <simon.faltum@databricks.com>
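
A rough sketch of the revised resolution in this second commit, covering only the explicit-matcher step (the AGENT fallback is omitted). The function name resolveExplicitMatches is hypothetical, and so is the product attached to COPILOT_MODEL: the PR text does not say which product that env var belongs to, so a placeholder string stands in for it. How the real code collapses the BYOK pair is not shown in this thread; skipping COPILOT_MODEL when COPILOT_CLI is set is one way to get the described result.

```typescript
type AgentEnv = Record<string, string | undefined>;

// ASSUMPTION: partial, illustrative matcher table. "other-copilot-product"
// is a placeholder, not a real canonical name.
const EXPLICIT_MATCHERS: ReadonlyArray<{ envVar: string; product: string }> = [
  { envVar: "CLAUDECODE", product: "claude-code" },
  { envVar: "COPILOT_CLI", product: "copilot-cli" },
  { envVar: "COPILOT_MODEL", product: "other-copilot-product" },
];

function resolveExplicitMatches(env: AgentEnv): string | undefined {
  const products: string[] = [];
  for (const m of EXPLICIT_MATCHERS) {
    if (env[m.envVar] === undefined) continue;
    // BYOK collapse: COPILOT_MODEL next to COPILOT_CLI is a known false
    // positive, so it does not count as a second agent.
    if (m.envVar === "COPILOT_MODEL" && env["COPILOT_CLI"] !== undefined) continue;
    if (products.indexOf(m.product) === -1) products.push(m.product);
  }
  if (products.length === 0) return undefined;
  // Nested agents are reported as "multiple" instead of being dropped.
  return products.length === 1 ? products[0] : "multiple";
}

console.log(resolveExplicitMatches({ CLAUDECODE: "1", COPILOT_CLI: "1" })); // multiple
console.log(resolveExplicitMatches({ COPILOT_CLI: "1", COPILOT_MODEL: "x" })); // copilot-cli
```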

parthban-db left a comment

Contributor


LGTM modulo comment.

export {ClientInfo, ClientInfoError} from './clientinfo';
export {addToDefault, setPartner, setProduct} from './base';
export {createDefault} from './default';
export {agentProvider} from './agent';

Contributor


Why do we expose this? I don't think users need to access it.

simonfaltum added this pull request to the merge queue on Apr 20, 2026
Merged via the queue into main with commit ed7cf24 on Apr 20, 2026
9 checks passed
simonfaltum deleted the simonfaltum/agent-detection branch on April 20, 2026, 13:34
simonfaltum added a commit that referenced this pull request on Apr 20, 2026:
agentProvider is an internal detail of clientinfo. The only consumer
inside the SDK (default.ts) imports it directly from ./agent, so the
public re-export had no user-facing purpose. Addresses review feedback
from #88.

Co-authored-by: Isaac
Signed-off-by: simon <simon.faltum@databricks.com>
2 participants