diff --git a/browsers/telemetry.mdx b/browsers/telemetry.mdx new file mode 100644 index 0000000..e547d35 --- /dev/null +++ b/browsers/telemetry.mdx @@ -0,0 +1,561 @@ +--- +title: "Browser Telemetry" +description: "Stream real-time console, network, page, and interaction events from your browser sessions" +--- + +Browser telemetry gives you a real-time stream of everything happening inside a session - console output, network requests, page lifecycle events, and user interactions. Events are delivered over [Server-Sent Events (SSE)](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events), so you can process them as they happen or feed them into your own observability pipeline. + +## Enabling telemetry + +Pass `telemetry.enabled` when creating a browser to start capturing with VM defaults: + + +```typescript Typescript/Javascript +import Kernel from '@onkernel/sdk'; + +const kernel = new Kernel(); + +const kernelBrowser = await kernel.browsers.create({ + telemetry: { enabled: true }, +}); +``` + +```python Python +from kernel import Kernel + +kernel = Kernel() + +kernel_browser = kernel.browsers.create( + telemetry={"enabled": True}, +) +``` + +```bash CLI +kernel browsers create --telemetry=all +``` + + +To capture only specific categories, pass per-category flags under `telemetry.browser`: + + +```typescript Typescript/Javascript +const kernelBrowser = await kernel.browsers.create({ + telemetry: { + enabled: true, + browser: { + console: { enabled: true }, + network: { enabled: true }, + page: { enabled: false }, + interaction: { enabled: false }, + }, + }, +}); +``` + +```python Python +kernel_browser = kernel.browsers.create( + telemetry={ + "enabled": True, + "browser": { + "console": {"enabled": True}, + "network": {"enabled": True}, + "page": {"enabled": False}, + "interaction": {"enabled": False}, + }, + }, +) +``` + +```bash CLI +kernel browsers create --telemetry=console=on,network=on,page=off,interaction=off +``` + + +Omitted categories default 
to enabled. Setting `enabled: false` at the top level (or `--telemetry=off` in the CLI) disables all capture and can't be combined with category settings. + +## Consuming the stream + +The telemetry endpoint streams events as SSE. Each frame carries a JSON object with a monotonic sequence number and the event payload: + + +```typescript Typescript/Javascript +const response = await fetch( + `https://api.onkernel.com/v1/browsers/${kernelBrowser.session_id}/telemetry`, + { + headers: { + 'Authorization': `Bearer ${process.env.KERNEL_API_KEY}`, + 'Accept': 'text/event-stream', + }, + }, +); + +const reader = response.body!.getReader(); +const decoder = new TextDecoder(); +let buffer = ''; + +while (true) { + const { done, value } = await reader.read(); + if (done) break; + + buffer += decoder.decode(value, { stream: true }); + const lines = buffer.split('\n'); + buffer = lines.pop()!; + + for (const line of lines) { + if (!line.startsWith('data: ')) continue; + const envelope = JSON.parse(line.slice(6)); + console.log(envelope.event.type, envelope.event); + } +} +``` + +```python Python +import httpx +import json +import os + +with httpx.stream( + "GET", + f"https://api.onkernel.com/v1/browsers/{kernel_browser.session_id}/telemetry", + headers={ + "Authorization": f"Bearer {os.environ['KERNEL_API_KEY']}", + "Accept": "text/event-stream", + }, + timeout=None, # httpx defaults to a 5 s read timeout, shorter than the 15 s keepalive interval +) as response: + for line in response.iter_lines(): + if not line.startswith("data: "): + continue + envelope = json.loads(line[6:]) + print(envelope["event"]["type"], envelope["event"]) +``` + + +The stream stays open until the browser session terminates. A keepalive comment is sent every 15 seconds when no events arrive. 
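The loops above look only at `data:` lines. If you also want each frame's `id:` value and want keepalive comments skipped explicitly, a minimal parsing sketch might look like the following. It assumes standard SSE framing (frames end with a blank line, comment lines start with `:`), which this page does not spell out for the telemetry endpoint. `parse_sse_frames` is a hypothetical helper, not part of the Kernel SDK.

```python
import json
from typing import Iterable, Iterator


def parse_sse_frames(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {"id": ..., "envelope": ...} for each SSE frame that carries data.

    Hypothetical helper: assumes standard SSE framing, where a blank line
    terminates a frame and lines beginning with ":" are comments.
    """
    event_id = None
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current frame; emit it only if it had data.
            if data_parts:
                yield {
                    "id": event_id,
                    "envelope": json.loads("\n".join(data_parts)),
                }
            event_id, data_parts = None, []
            continue
        if line.startswith(":"):
            continue  # comment line, e.g. a keepalive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips a single leading space from the value
        if field == "id":
            event_id = value
        elif field == "data":
            data_parts.append(value)


# Illustrative input: one keepalive comment followed by one event frame.
sample = [
    ": keepalive",
    "",
    "id: 17",
    'data: {"seq": 17, "event": {"type": "console_log"}}',
    "",
]
frames = list(parse_sse_frames(sample))
```

With `httpx`, the same generator could be fed from `response.iter_lines()`, since it accepts any iterable of text lines.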
+ +### CLI streaming + +The CLI provides a dedicated `telemetry stream` command with built-in filtering: + +```bash +# Stream all events (human-readable) +kernel browsers telemetry stream + +# Stream as NDJSON +kernel browsers telemetry stream -o json + +# Filter by category +kernel browsers telemetry stream --categories network,console + +# Filter by specific event types +kernel browsers telemetry stream --types network_request,network_response + +# Resume from a sequence number +kernel browsers telemetry stream --seq 42 +``` + +The default output is tab-separated (`timestamp [category] type`). Use `-o json` for NDJSON when piping into other tools. + +## Reconnecting + +Each SSE frame includes an `id:` field set to the event's sequence number. If your connection drops, pass the last received sequence number as `Last-Event-ID` to resume without gaps: + + +```typescript Typescript/Javascript +const lastSeq = 42; // last seq you successfully processed + +const response = await fetch( + `https://api.onkernel.com/v1/browsers/${kernelBrowser.session_id}/telemetry`, + { + headers: { + 'Authorization': `Bearer ${process.env.KERNEL_API_KEY}`, + 'Accept': 'text/event-stream', + 'Last-Event-ID': String(lastSeq), + }, + }, +); +``` + +```python Python +last_seq = 42 # last seq you successfully processed + +with httpx.stream( + "GET", + f"https://api.onkernel.com/v1/browsers/{kernel_browser.session_id}/telemetry", + headers={ + "Authorization": f"Bearer {os.environ['KERNEL_API_KEY']}", + "Accept": "text/event-stream", + "Last-Event-ID": str(last_seq), + }, + timeout=None, # httpx defaults to a 5 s read timeout, shorter than the 15 s keepalive interval +) as response: + for line in response.iter_lines(): + if not line.startswith("data: "): + continue + envelope = json.loads(line[6:]) + print(envelope["event"]["type"], envelope["event"]) +``` + +```bash CLI +kernel browsers telemetry stream --seq 42 +``` + + +Gaps in received `seq` values indicate dropped events. 
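Gap detection and resume bookkeeping can be sketched as below. The sketch assumes `seq` increases by exactly 1 between consecutive events on an unfiltered stream. The page only says the number is monotonic, so treat the step size as an assumption. `SeqTracker` is a hypothetical helper, not part of the Kernel SDK.

```python
from typing import Optional


class SeqTracker:
    """Tracks the last processed seq and counts events missing before each new one.

    Hypothetical helper: assumes seq advances by exactly 1 per event.
    """

    def __init__(self, last_seq: Optional[int] = None):
        self.last_seq = last_seq

    def observe(self, seq: int) -> int:
        """Record seq and return how many events were skipped before it."""
        missed = 0
        if self.last_seq is not None:
            if seq <= self.last_seq:
                return 0  # duplicate or replayed event; never move backwards
            missed = seq - self.last_seq - 1
        self.last_seq = seq
        return missed

    def resume_headers(self) -> dict:
        """Extra request headers to send when reconnecting."""
        if self.last_seq is None:
            return {}
        return {"Last-Event-ID": str(self.last_seq)}


tracker = SeqTracker()
gaps = [tracker.observe(seq) for seq in (40, 41, 44, 44)]
```

Call `observe` only after an event has been fully processed, so that `resume_headers()` never skips an event that failed midway.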
+ +## Updating telemetry on a running session + +You can enable, disable, or reconfigure telemetry categories on an active session with a PATCH: + + +```typescript Typescript/Javascript +await kernel.browsers.update(kernelBrowser.session_id, { + telemetry: { + browser: { + network: { enabled: false }, + }, + }, +}); +``` + +```python Python +kernel.browsers.update( + kernel_browser.session_id, + telemetry={ + "browser": { + "network": {"enabled": False}, + }, + }, +) +``` + +```bash CLI +kernel browsers update --telemetry=network=off +``` + + +Omitted categories are left unchanged - only the categories you specify are modified. To disable all telemetry on a running session, set `enabled: false`: + + +```typescript Typescript/Javascript +await kernel.browsers.update(kernelBrowser.session_id, { + telemetry: { enabled: false }, +}); +``` + +```python Python +kernel.browsers.update( + kernel_browser.session_id, + telemetry={"enabled": False}, +) +``` + +```bash CLI +kernel browsers update --telemetry=off +``` + + +## Event body + +Every SSE frame carries a JSON object with this shape: + +```json +{ + "seq": 17, + "event": { + "ts": 1716900000000000, + "type": "console_log", + "source": { + "kind": "cdp", + "event": "Runtime.consoleAPICalled" + }, + "data": { + "level": "log", + "text": "Page loaded successfully", + "args": ["Page loaded successfully"], + "url": "https://example.com", + "target_type": "page" + } + } +} +``` + +| Field | Description | +|---|---| +| `seq` | Monotonic sequence number, also the SSE `id:` field. Use for reconnection via `Last-Event-ID`. | +| `event.ts` | Event timestamp in Unix microseconds. | +| `event.type` | Discriminator for the event type (e.g. `network_request`, `page_load`). | +| `event.source.kind` | Where the event originated: `cdp` (Chrome DevTools), `kernel_api` (Kernel platform), `extension` (browser extension), or `local_process` (VM process). | +| `event.data` | Event-specific payload. 
CDP-sourced events include context fields for correlation (see below). | + +### Shared context fields + +CDP-sourced events include these fields in `event.data` for correlating events across tabs, navigations, and request chains: + +| Field | Type | Description | +|---|---|---| +| `session_id` | string | CDP session identifier for the target connection | +| `target_id` | string | Browser target identifier (stable across navigations within a tab) | +| `target_type` | string | Target type: `page`, `background_page`, `service_worker`, `shared_worker`, or `other` | +| `frame_id` | string | Frame within the target | +| `loader_id` | string | Document loader identifier, resets on each navigation | +| `url` | string | URL of the frame or request at the time of the event | +| `nav_seq` | integer | Monotonically increasing navigation sequence number within the target | + + +Monitor events (`monitor_*`) and `page_tab_opened` do not include shared context fields - they originate outside of a CDP session. + + +## Event types + +Telemetry events fall into four configurable categories plus a set of monitor events that are always emitted. + +### Console + +Console output and uncaught exceptions from the browser. + +| Type | Description | +|---|---| +| `console_log` | `console.log`, `console.info`, `console.warn`, and other non-error console calls | +| `console_error` | `console.error` calls and uncaught JavaScript exceptions | + +#### `console_log` data + +| Field | Type | Description | +|---|---|---| +| `level` | string | Console API method (`log`, `info`, `warn`, `debug`, `dir`, etc.) | +| `text` | string | First argument coerced to string | +| `args` | string[] | All arguments coerced to strings | +| `stack_trace` | object | JavaScript call stack (when available) | + +#### `console_error` data + +Emitted from two sources: `console.error()` calls and uncaught exceptions. Some fields are only present for one source. 
+ +| Field | Type | Description | +|---|---|---| +| `text` | string | Error message text | +| `stack_trace` | object | Call stack | +| `level` | string | Always `"error"` (console.error only) | +| `args` | string[] | All arguments coerced to strings (console.error only) | +| `line` | integer | Line number (uncaught exceptions only) | +| `column` | integer | Column number (uncaught exceptions only) | +| `source_url` | string | Script URL (uncaught exceptions only) | + +### Network + +HTTP request and response metadata flowing through the browser. + +| Type | Description | +|---|---| +| `network_request` | Outgoing HTTP request (URL, method, headers, post data) | +| `network_response` | HTTP response (status code, headers, body) | +| `network_loading_failed` | Failed network request (error text, canceled flag) | +| `network_idle` | No in-flight requests for 500 ms (matches Playwright's `networkidle` heuristic) | + +#### `network_request` data + +| Field | Type | Description | +|---|---|---| +| `request_id` | string | Unique request identifier within the session | +| `method` | string | HTTP method (GET, POST, etc.) | +| `document_url` | string | URL of the document that initiated the request | +| `headers` | object | Request headers | +| `initiator_type` | string | What triggered the request (`script`, `parser`, `preload`, `other`) | +| `resource_type` | string | Resource type (`Document`, `Fetch`, `XHR`, `Script`, `Stylesheet`, `Image`, etc.) | +| `post_data` | string | Request body for POST/PUT requests | +| `is_redirect` | boolean | True if this request results from a redirect | +| `redirect_url` | string | Original URL before redirect | + +#### `network_response` data + +| Field | Type | Description | +|---|---|---| +| `request_id` | string | Matches the originating `network_request` | +| `method` | string | HTTP method of the original request | +| `status` | integer | HTTP status code | +| `status_text` | string | HTTP status text (e.g. 
"OK", "Not Found") | +| `headers` | object | Response headers | +| `mime_type` | string | MIME type (e.g. `text/html`, `application/json`) | +| `resource_type` | string | Resource type | +| `body` | string | Truncated response body (text MIME types only) | + + +Response bodies are truncated at 8 KB for structured types (JSON, XML, form data) and 4 KB for other text. Binary responses (images, fonts, media) are excluded. + + +#### `network_loading_failed` data + +| Field | Type | Description | +|---|---|---| +| `request_id` | string | Matches the originating `network_request` | +| `error_text` | string | Error description (e.g. `net::ERR_CONNECTION_REFUSED`) | +| `canceled` | boolean | True if canceled by browser or page script | +| `resource_type` | string | Resource type | + +#### `network_idle` data + +No additional fields beyond the [shared context fields](#shared-context-fields). Fires after 500 ms with no in-flight HTTP requests. + +### Page + +Page lifecycle events from navigation through layout stability. + +| Type | Description | +|---|---| +| `page_navigation` | Navigation to a new URL | +| `page_dom_content_loaded` | DOMContentLoaded fired | +| `page_load` | Page load complete | +| `page_tab_opened` | New tab or window opened | +| `page_layout_shift` | Cumulative Layout Shift (CLS) detected | +| `page_lcp` | Largest Contentful Paint recorded | +| `page_layout_settled` | No layout shifts for 1 s after `page_load` | +| `page_navigation_settled` | Both `page_dom_content_loaded` and `page_layout_settled` have fired for the current navigation | + +#### `page_navigation` data + +| Field | Type | Description | +|---|---|---| +| `session_id` | string | CDP session identifier | +| `target_id` | string | Browser target identifier | +| `target_type` | string | Target type (`page`, `background_page`, `service_worker`, etc.) 
| +| `url` | string | URL navigated to | +| `frame_id` | string | Frame that navigated | +| `loader_id` | string | New document loader identifier | +| `parent_frame_id` | string | Parent frame for subframe navigations (absent for top-level) | + + +`page_navigation` does not include `nav_seq` because this event resets the navigation epoch. Subsequent events in the same navigation share the new `nav_seq` value. + + +#### `page_dom_content_loaded` / `page_load` data + +| Field | Type | Description | +|---|---|---| +| `cdp_timestamp` | number | Chrome monotonic clock time in seconds when the event fired | + +#### `page_tab_opened` data + +| Field | Type | Description | +|---|---|---| +| `target_id` | string | Target identifier for the new tab | +| `target_type` | string | Target type | +| `url` | string | Initial URL | +| `title` | string | Initial page title | +| `opener_id` | string | Target identifier of the tab that opened this one | + + +`page_tab_opened` fires before CDP attaches to the new tab, so `session_id`, `frame_id`, `loader_id`, and `nav_seq` are absent. 
+ + +#### `page_layout_shift` data + +| Field | Type | Description | +|---|---|---| +| `source_frame_id` | string | Frame where the layout shift occurred | +| `time` | number | Performance Timeline timestamp in milliseconds | +| `duration` | number | Duration of the shift entry in ms | +| `layout_shift_details.value` | number | Layout shift score (contribution to CLS) | +| `layout_shift_details.had_recent_input` | boolean | True if preceded by user input within 500 ms (excluded from CLS) | + +#### `page_lcp` data + +| Field | Type | Description | +|---|---|---| +| `source_frame_id` | string | Frame where LCP element was rendered | +| `time` | number | Performance Timeline timestamp in milliseconds | +| `lcp_details.render_time` | number | Render time of LCP element in ms | +| `lcp_details.load_time` | number | Load time of LCP element in ms | +| `lcp_details.size` | number | Visible area in pixels squared | +| `lcp_details.element_id` | string | `id` attribute of the LCP element | +| `lcp_details.url` | string | URL of the element (images/video) | + +#### `page_layout_settled` / `page_navigation_settled` data + +No additional fields beyond the [shared context fields](#shared-context-fields). + +### Interaction + +Browser-native interaction events from clicks, typing, and scrolling inside the page. These capture DOM-level interactions, not [computer controls](/browsers/computer-controls) API calls. Events are rate-limited to 20 per second per type to prevent flooding. 
+ +| Type | Description | +|---|---| +| `interaction_click` | Mouse click (coordinates, target selector) | +| `interaction_key` | Keyboard input | +| `interaction_scroll_settled` | Scroll position stabilized after scrolling | + +#### `interaction_click` data + +| Field | Type | Description | +|---|---|---| +| `x` | integer | Viewport x-coordinate in CSS pixels | +| `y` | integer | Viewport y-coordinate in CSS pixels | +| `selector` | string | CSS selector path to the clicked element | +| `tag` | string | HTML tag name (e.g. `BUTTON`, `A`, `DIV`) | +| `text` | string | Visible text content of the element (suppressed for sensitive fields) | + +#### `interaction_key` data + +| Field | Type | Description | +|---|---|---| +| `key` | string | Key value (e.g. `Enter`, `Backspace`, `a`) | +| `selector` | string | CSS selector path to the focused element | +| `tag` | string | HTML tag name (e.g. `INPUT`, `TEXTAREA`) | + +#### `interaction_scroll_settled` data + +| Field | Type | Description | +|---|---|---| +| `from_x` | integer | Scroll x-position at start of gesture | +| `from_y` | integer | Scroll y-position at start of gesture | +| `to_x` | integer | Final scroll x-position | +| `to_y` | integer | Final scroll y-position | +| `target_selector` | string | CSS selector of the scrolled element | + + +Sensitive input fields (password fields, credit card numbers, SSNs) are automatically detected. Key events are suppressed entirely for these fields, and click events omit text content but still include position and selector. + + +### Monitor + +Monitor events are always emitted regardless of category configuration. They report on the health of the telemetry stream itself. 
+ +| Type | Description | +|---|---| +| `monitor_screenshot` | Screenshot captured (triggered on page load and exceptions, rate-limited to one every 2 s) | +| `monitor_disconnected` | CDP connection to Chrome lost - events may be dropped until reconnection | +| `monitor_reconnected` | CDP connection re-established (auto-reconnect with exponential backoff up to 10 retries) | +| `monitor_reconnect_failed` | All reconnection attempts exhausted - no further events will arrive | +| `monitor_init_failed` | Telemetry initialization failed on the VM | + +#### `monitor_screenshot` data + +| Field | Type | Description | +|---|---|---| +| `png` | string | Base64-encoded PNG screenshot of the browser viewport | + +#### `monitor_disconnected` data + +| Field | Type | Description | +|---|---|---| +| `reason` | string | Reason for disconnection (e.g. `chrome_restarted`) | + +#### `monitor_reconnected` data + +| Field | Type | Description | +|---|---|---| +| `reconnect_duration_ms` | integer | Time in milliseconds taken to reconnect | + +#### `monitor_reconnect_failed` data + +| Field | Type | Description | +|---|---|---| +| `reason` | string | Reason for failure (e.g. `reconnect_exhausted`) | + +#### `monitor_init_failed` data + +| Field | Type | Description | +|---|---|---| +| `step` | string | The initialization step that failed (e.g. `Target.setAutoAttach`) | + + +After a `monitor_disconnected` event, treat any in-progress computed state (like `network_idle` or `page_layout_settled`) as unreliable until `monitor_reconnected` arrives. 
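One way to apply this guidance is to track connection health from the monitor events and flag computed events that arrive while disconnected. The sketch below assumes the envelope shape shown under "Event body". `StreamHealth` is a hypothetical helper, and counting `page_navigation_settled` as computed state is an assumption based on its dependence on `page_layout_settled`.

```python
class StreamHealth:
    """Decides whether computed events can be trusted, based on monitor_* events.

    Hypothetical helper, not part of the Kernel SDK.
    """

    # Computed event types; page_navigation_settled is included by assumption.
    COMPUTED = {"network_idle", "page_layout_settled", "page_navigation_settled"}

    def __init__(self) -> None:
        self.connected = True
        self.exhausted = False  # set once reconnection attempts have run out

    def handle(self, envelope: dict) -> bool:
        """Update state from one envelope; return True if the event is trustworthy."""
        event_type = envelope["event"]["type"]
        if event_type == "monitor_disconnected":
            self.connected = False
        elif event_type == "monitor_reconnected":
            self.connected = True
        elif event_type == "monitor_reconnect_failed":
            self.connected = False
            self.exhausted = True
        # Raw events stay usable; computed ones are suspect while disconnected.
        return self.connected or event_type not in self.COMPUTED


health = StreamHealth()
sequence = [
    "network_idle",
    "monitor_disconnected",
    "network_idle",
    "console_log",
    "monitor_reconnected",
    "page_layout_settled",
]
trusted = [health.handle({"event": {"type": t}}) for t in sequence]
```

When `exhausted` becomes true, no further events will arrive, so a consumer would typically close the stream at that point.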
+ diff --git a/docs.json b/docs.json index 763e3d5..85942cd 100644 --- a/docs.json +++ b/docs.json @@ -81,6 +81,7 @@ "expanded": true, "pages": [ "browsers/live-view", + "browsers/telemetry", "browsers/termination", "browsers/standby", "browsers/headless", diff --git a/introduction/observe.mdx b/introduction/observe.mdx index fba6a8f..a88b99a 100644 --- a/introduction/observe.mdx +++ b/introduction/observe.mdx @@ -3,7 +3,7 @@ title: "Observe" description: "Watch your agent work, debug what went wrong" --- -Browser agents fail in ways that don't show up in logs. Kernel gives you four ways to see what's actually happening — live, after the fact, frame by frame, and line by line. +Browser agents fail in ways that don't show up in logs. Kernel gives you four ways to see what's actually happening - live, after the fact, frame by frame, and line by line. ## Live view @@ -37,7 +37,7 @@ Add `?readOnly=true` for a non-interactive view, or enable [kiosk mode](/browser ## Replays -Replays are MP4 recordings you start and stop on demand — capture as many clips per session as you need. They're the right tool for post-hoc debugging: a failed run gives you one or more videos to scrub through, share, or attach to a bug report. +Replays are MP4 recordings you start and stop on demand - capture as many clips per session as you need. They're the right tool for post-hoc debugging: a failed run gives you one or more videos to scrub through, share, or attach to a bug report. Replays can also be enabled on managed auth sessions, so you can [debug failed logins](https://www.kernel.sh/docs/auth/configuration#record-sessions-for-debugging) the same way. @@ -79,7 +79,7 @@ Full reference: [Replays](/browsers/replays). ## Screenshots -Pull a frame at any moment with computer controls — useful for snapshotting state at decision points, attaching to traces, or feeding back into a vision model. 
+Pull a frame at any moment with computer controls - useful for snapshotting state at decision points, attaching to traces, or feeding back into a vision model. ```typescript Typescript/Javascript @@ -101,6 +101,50 @@ kernel browsers computer screenshot --to screenshot.png For full-page captures, use [Playwright execution](/browsers/playwright-execution#screenshots) instead. +## Browser telemetry + +Stream real-time events from the browser - console output, network requests, page lifecycle, and interactions - over Server-Sent Events. Enable it at creation, tune which categories you capture, and pipe the stream into your own observability stack. + + +```typescript Typescript/Javascript +const kernelBrowser = await kernel.browsers.create({ + telemetry: { enabled: true }, +}); + +const response = await fetch( + `https://api.onkernel.com/v1/browsers/${kernelBrowser.session_id}/telemetry`, + { + headers: { + 'Authorization': `Bearer ${process.env.KERNEL_API_KEY}`, + 'Accept': 'text/event-stream', + }, + }, +); +``` + +```python Python +kernel_browser = kernel.browsers.create( + telemetry={"enabled": True}, +) + +# Stream events via SSE +# GET /v1/browsers/{session_id}/telemetry +``` + +```bash CLI +# Create with telemetry enabled +kernel browsers create --telemetry=all + +# Stream events +kernel browsers telemetry stream + +# Stream as NDJSON, filtered to network events +kernel browsers telemetry stream -o json --categories network +``` + + +Full reference: [Browser Telemetry](/browsers/telemetry). + ## Invocation logs If you're running an agent on Kernel's [app platform](/apps/develop), every invocation produces a streaming log feed. Tail it live while the agent runs, or pull it after the fact for debugging. @@ -128,5 +172,5 @@ Full reference: [Logs](/apps/logs). - **Building the agent?** Keep a [live view](/browsers/live-view) tab open while you iterate. - **Debugging a failure?** Capture a [replay](/browsers/replays) for the run, then watch the video. 
-- **Instrumenting the agent itself?** Drop [screenshots](/browsers/computer-controls#take-screenshots) and [logs](/apps/logs) into your traces at the points that matter. +- **Instrumenting the agent itself?** Stream [browser telemetry](/browsers/telemetry) into your pipeline, or drop [screenshots](/browsers/computer-controls#take-screenshots) and [logs](/apps/logs) into your traces at the points that matter. - **Putting a human in the loop?** Embed the [live view](/browsers/live-view#embedding-in-an-iframe) in your own UI. diff --git a/reference/cli/browsers.mdx b/reference/cli/browsers.mdx index 0b7586e..ba0d69a 100644 --- a/reference/cli/browsers.mdx +++ b/reference/cli/browsers.mdx @@ -22,6 +22,7 @@ Create a new browser session. | `--headless` | Launch without GUI/VNC access. | | `--kiosk` | Launch in Chrome kiosk mode. | | `--start-url ` | Initial page to open on launch. | +| `--telemetry ` | Enable browser telemetry. `all` enables all categories, `off` disables, or use per-category config like `network=on,page=off`. | | `--output json`, `-o json` | Output raw JSON object. | ### `kernel browsers delete ` @@ -45,6 +46,25 @@ Get detailed information about a browser session. |------|-------------| | `--output json`, `-o json` | Output raw JSON object. | +### `kernel browsers update ` +Update an existing browser session. + +| Flag | Description | +|------|-------------| +| `--telemetry ` | Update telemetry config. `all` enables all categories, `off` disables, or use per-category config like `network=on,page=off`. Unspecified categories retain their current state. | + +## Browser telemetry + +### `kernel browsers telemetry stream ` +Stream live telemetry events from a browser session. + +| Flag | Description | +|------|-------------| +| `--categories ` | Comma-separated categories to include (e.g. `network,console,page,interaction,system`). | +| `--types ` | Comma-separated event types to include (e.g. `network_request,console_error`). 
| +| `--seq ` | Resume from this sequence number. `0` replays from retention start, positive values resume from that point. | +| `--output json`, `-o json` | Output NDJSON (one compact JSON object per line). Default is tab-separated `timestamp [category] type`. | + ### `kernel browsers ssh ` Open an interactive SSH session to a browser VM. Requires [websocat](https://github.com/vi/websocat) to be installed locally.