docs(llma): migrate onboarding docs to OpenTelemetry auto-instrumentation #53668
…tion Replace PostHog SDK wrapper approach with standard OpenTelemetry auto-instrumentation across 22 provider onboarding guides. Each doc now follows a 3-step pattern: install OTel deps, set up tracing, call the provider with the native SDK.
- Remove user.id from Vercel AI resource (posthog_distinct_id in experimental_telemetry is the established mechanism for that SDK)
- Normalize service.name to 'my-app' in Vercel AI doc
- Add anonymous events blockquote to langchain, langgraph, llamaindex
- Add Node tab to OpenAI embeddings step
…smolagents, pydantic-ai docs to OTel
Match the posthog-js #3358 API: drop the manual SimpleSpanProcessor + PostHogTraceExporter wiring in favor of PostHogSpanProcessor. Convex keeps PostHogTraceExporter since BatchSpanProcessor doesn't work in V8.
Match posthog-python #494: drop the manual SimpleSpanProcessor + OTLPSpanExporter wiring in favor of PostHogSpanProcessor, and swap opentelemetry-exporter-otlp-proto-http for posthog[otel] in the pip install commands.
## Changes

Updates `tableOfContents` frontmatter in LLM analytics installation docs to match the new step titles from the OTel migration in PostHog/posthog#53668.

Each migrated doc now uses the same TOC: the three migration steps plus a verification entry.

1. Install dependencies
2. Set up OpenTelemetry tracing
3. Call/Run {provider}
4. Verify traces and generations

**Docs updated:**

- OpenAI-compatible: openai, groq, deepseek, mistral, cohere, fireworks-ai, xai, hugging-face, openrouter, together-ai, ollama, cerebras, perplexity
- Gateways: portkey, helicone
- Azure OpenAI, Anthropic
- Frameworks: langchain, langgraph, llamaindex
- autogen, instructor, mirascope, semantic-kernel, smolagents, pydantic-ai
- Mastra (now using the `@mastra/posthog` exporter, 3-step TOC)
- Manual capture (added a TOC for the first time: Capture LLM events manually → Event properties → Verify traces and generations)

**Also includes** a branch-pinning change in `gatsby-config.js` and `gatsby/onPreBootstrap.ts` to point at `docs/otel-onboarding-docs` in the posthog repo, so the Vercel preview build shows the new docs before PostHog/posthog#53668 lands. **Must be reverted to `master` before merging.**

**Note:** This PR should be merged after PostHog/posthog#53668 lands, since the TOC entries reference the new step titles from that PR.

## Checklist

- [x] I've read the [docs](https://posthog.com/handbook/docs-and-wizard/docs-style-guide) and/or [content](https://posthog.com/handbook/content/posthog-style-guide) style guides.
- [x] Words are spelled using American English
- [x] Use relative URLs for internal links
- [x] I've checked the pages added or changed in the Vercel preview build
- [x] If I moved a page, I added a redirect in `vercel.json`
…482)

## Problem

Our AI examples use PostHog's direct SDK wrappers (`posthog.ai.openai`, `posthog.ai.anthropic`, etc.) for tracking LLM calls. We want to silently deprecate these in favor of standard OpenTelemetry auto-instrumentation, which is more portable and follows industry conventions.

## Changes

Migrates Python AI examples from PostHog wrappers to OpenTelemetry auto-instrumentation:

- **OpenAI-compatible providers** (Groq, DeepSeek, Mistral, xAI, Together AI, Ollama, Cohere, Hugging Face, Perplexity, Cerebras, Fireworks AI, OpenRouter, Helicone, Vercel AI Gateway, Portkey) → `opentelemetry-instrumentation-openai-v2`
- **OpenAI** (all files), **Azure OpenAI**, **Instructor**, **Autogen**, **Mirascope**, **Semantic Kernel**, **smolagents** → `opentelemetry-instrumentation-openai-v2`
- **Anthropic** (chat, streaming, extended thinking) → `opentelemetry-instrumentation-anthropic`
- **LangChain**, **LangGraph** → `opentelemetry-instrumentation-langchain`
- **LlamaIndex** → `opentelemetry-instrumentation-llamaindex`
- **Gemini** → `opentelemetry-instrumentation-google-generativeai`

All OTel-based examples set resource attributes to demonstrate the full feature set:

```python
from opentelemetry.sdk.resources import SERVICE_NAME, Resource

# Resource attributes are attached to every span the provider emits.
resource = Resource(
    attributes={
        SERVICE_NAME: "example-groq-app",
        "posthog.distinct_id": "example-user",
        "foo": "bar",
        "conversation_id": "abc-123",
    }
)
```

These map to `distinct_id` and custom event properties via PostHog's OTLP ingestion endpoint.

**Kept as-is:** CrewAI (uses LiteLLM callbacks, internally manages its own TracerProvider), LiteLLM/DSPy (use LiteLLM's built-in PostHog callback), OpenAI Agents (uses dedicated `posthog.ai.openai_agents.instrument()`), Pydantic AI (already OTel via `Agent.instrument_all()`), AWS Bedrock (already OTel via `opentelemetry-instrumentation-botocore`).

Key implementation details:

- Uses `SimpleSpanProcessor` instead of `BatchSpanProcessor` so spans export immediately without needing `provider.shutdown()`
- `# noqa: E402` on intentional late imports after `Instrumentor().instrument()` calls
- Azure OpenAI example uses the generic `gpt-4o` model name instead of a deployment-specific one

## How did you test this code?

Manually ran each example against real provider API keys via `llm-analytics-apps/run-examples.sh` to verify:

1. Each script runs successfully end-to-end
2. Traces arrive at PostHog as `$ai_generation` events
3. Resource attributes (`posthog.distinct_id`, `foo`, `conversation_id`) flow through as event properties
4. `distinct_id` is correctly set on each event

All examples pass `ruff format` and `ruff check`.

This is an agent-authored PR. I haven't manually tested each provider end-to-end beyond spot checks, though all examples follow the same pattern and the migration was verified on several providers.

## Publish to changelog?

No

## Docs update

The onboarding docs will be updated separately in PostHog/posthog#53668 and PostHog/posthog.com#16236.

## 🤖 LLM context

Co-authored with Claude Code.

Related PRs:

- PostHog/posthog-js#3349 (Node.js examples)
- PostHog/posthog#53668 (in-app onboarding docs)
- PostHog/posthog.com#16236 (docs.posthog.com TOC updates)
…tion (#53668)

## Problem

The in-app LLM analytics onboarding docs currently show PostHog SDK wrappers (`posthog.ai.openai`, `@posthog/ai`) as the primary integration method. We're moving toward standard OpenTelemetry auto-instrumentation as the recommended approach, keeping the wrappers available as a last resort for users who need them.

## Changes

Migrates 28 provider onboarding guides from the PostHog SDK wrapper approach to OpenTelemetry auto-instrumentation. Each migrated doc follows a consistent 3-step pattern:

1. **Install dependencies:** OTel SDK + provider-specific instrumentation + provider SDK
2. **Set up OpenTelemetry tracing:** `TracerProvider` with `OTLPSpanExporter` (Python) or `PostHogTraceExporter` (Node)
3. **Call the provider:** Native SDK usage, no `posthog_` parameters

User identification uses the `posthog.distinct_id` resource attribute; custom properties like `foo` and `conversation_id` demonstrate the passthrough behavior.

**Providers migrated (21 original + 7 more):**

- OpenAI-compatible: openai, groq, deepseek, mistral, cohere, fireworks-ai, xai, hugging-face, openrouter, together-ai, ollama, cerebras, perplexity
- AI gateways: portkey, helicone, vercel-ai-gateway
- Azure OpenAI, Anthropic
- Frameworks: langchain, langgraph, llamaindex
- Vercel AI (SimpleSpanProcessor update)
- **Newly migrated:** autogen, instructor, mirascope, semantic-kernel, smolagents, pydantic-ai (already OTel-based in examples; docs just hadn't caught up)
- **Mastra** (migrated to the `@mastra/posthog` exporter, Mastra's native integration, since there's no mature OTel path yet)

**Kept as-is:** google (no Node.js OTel instrumentation for `@google/genai`), aws-bedrock (already OTel), crewai, litellm, dspy, openai-agents, manual

**Migration callout:** Every migrated doc has a callout at the top linking to the full Node.js and Python examples on GitHub (current `main`/`master`), plus legacy wrapper examples pinned to the last commit before the migration. This helps both new users (who see the recommended OTel path) and existing users (who can find the old wrapper examples they're already using).

**Manual capture doc:** Split into multiple steps so it can have a useful table of contents (Capture LLM events manually → Event properties → Verify traces and generations).

**Removed:** "No proxy" callouts from all migrated docs. They made sense when wrapping SDK clients but don't apply to standard OTel tracing.

## How did you test this code?

Agent-authored PR. Manually verified:

- TypeScript compiles (`pnpm --filter=@posthog/frontend format` runs cleanly)
- Each migrated doc follows the same pattern as other OTel docs
- Example code in the docs matches what's in the real example repos (PostHog/posthog-js#3349 and PostHog/posthog-python#482)

No runtime tests since these are in-app onboarding content (static TSX returning step definitions).

## Publish to changelog?

No

## Docs update

This IS the docs update. Companion PR: PostHog/posthog.com#16236 (updates TOC frontmatter in posthog.com).

## 🤖 LLM context

Co-authored with Claude Code.

Related PRs:

- PostHog/posthog-js#3349 (Node.js OTel examples)
- PostHog/posthog-python#482 (Python OTel examples)
- PostHog/posthog.com#16236 (docs.posthog.com TOC updates)
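The passthrough behavior described above (`posthog.distinct_id` identifies the user, other resource attributes become custom event properties) can be pictured with a small standalone sketch. The function `split_resource_attributes` is hypothetical and written only for illustration; it is not PostHog's ingestion code, and it does not cover how PostHog treats other attributes such as `service.name`.

```python
from typing import Any, Dict, Mapping, Optional, Tuple

# Attribute name taken from the PR description.
DISTINCT_ID_KEY = "posthog.distinct_id"


def split_resource_attributes(
    attributes: Mapping[str, Any],
) -> Tuple[Optional[str], Dict[str, Any]]:
    """Hypothetical helper: separate the user identifier from the
    attributes that pass through as custom properties."""
    distinct_id = attributes.get(DISTINCT_ID_KEY)
    properties = {
        key: value
        for key, value in attributes.items()
        if key != DISTINCT_ID_KEY
    }
    return distinct_id, properties


# The three attributes the example PR reports verifying.
distinct_id, properties = split_resource_attributes(
    {
        "posthog.distinct_id": "example-user",
        "foo": "bar",
        "conversation_id": "abc-123",
    }
)
print(distinct_id)
print(properties)
```

The point of the sketch is only the shape of the mapping: one reserved attribute selects the user, and everything else is carried along unchanged.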
