zradar

OpenTelemetry-native observability for AI agents, LLM apps, and multi-step workflows.

zradar helps developers understand what their agents are doing, why they fail, how much they cost, and where latency appears. It ingests standard OTLP telemetry and stores high-volume traces, metrics, and logs in a Parquet-first architecture built for fast analytical queries.

flowchart LR
    app["AI app or agent framework"] --> otel["OpenTelemetry<br/>spans · metrics · logs"]
    otel --> zradar["zradar"]
    zradar --> traces["Agent traces<br/>workflow timelines"]
    zradar --> llm["LLM tokens<br/>cost · latency"]
    zradar --> tools["Tool calls<br/>retrieval visibility"]
    zradar --> errors["Error and performance<br/>breakdowns"]
    zradar --> parquet["Parquet-backed<br/>telemetry storage"]

Why developers try zradar

Use standards you already know: send OpenTelemetry data instead of adopting a proprietary SDK.
See full agent execution paths: connect agent spans, LLM calls, tools, retrievers, chains, and evaluations in one trace model.
Debug faster: inspect latency, errors, status codes, and span metadata across traces, logs, and metrics.
Control LLM cost: track token usage and estimated cost by model, agent, service, and project.
Keep telemetry affordable: store high-volume data in Parquet while PostgreSQL handles metadata and control-plane state.
Run locally first: develop against local PostgreSQL and local Parquet files, then add S3-compatible storage when needed.

What you can observe

Agent workflows

flowchart TD
    user["User request"] --> session["Agent session"]
    session --> plan["Planning step"]
    session --> rag["Retrieval / RAG"]
    session --> tool["Tool execution"]
    session --> llm["LLM generation"]
    session --> eval["Evaluation / guardrail"]
    plan --> timeline["Unified trace timeline"]
    rag --> timeline
    tool --> timeline
    llm --> timeline
    eval --> timeline

zradar is designed for the shape of AI applications, not just traditional request/response services.

LLM usage and cost

Track model calls, prompt tokens, completion tokens, total tokens, latency, status, and cost signals so you can answer questions like:

Which agents are most expensive?
Which models are slowest?
Where are token spikes coming from?
Which tools or chains fail most often?

Logs, metrics, and traces together

zradar supports all three telemetry signals through OTLP ingestion and query APIs, making it easier to correlate behavior across your agent stack.

Architecture

zradar uses PostgreSQL for metadata and control-plane state, and Parquet for telemetry data.

flowchart TD
    otlp["OTLP applications<br/>traces · metrics · logs"] --> server["zradar server<br/>auth · rate limits · circuit breakers"]
    server --> writer["Parquet writer<br/>buffer · bloom filters · atomic writes"]
    writer --> parquet["Parquet storage<br/>local disk or S3"]
    writer --> pg["PostgreSQL metadata<br/>file_list · stream_stats · settings"]
    parquet --> cache["DiskCache / MemoryCache"]
    pg --> reader["Query service"]
    cache --> reader
    reader --> df["DataFusion ListingTable"]
    df --> api["Analytics and query APIs"]
    parquet --> jobs["Retention · FileMover · compaction"]
    jobs --> pg

The Parquet path includes batching, atomic writes, bloom filters, DataFusion ListingTable queries, memory/disk caching, retention, and compaction for small-file control.

Core capabilities

OTLP ingestion: traces, metrics, and logs.
Agent-aware span model: agent, generation, tool, chain, retriever, evaluator, embedding, and guardrail spans.
Analytics APIs: service analytics, top endpoints, error breakdowns, LLM analytics, and agent analytics.
Project settings: per-project ingestion and storage controls.
Retention controls: organization defaults with project overrides.
Resilience: per-project rate limiting, backpressure, health readiness, and circuit breakers.
Storage lifecycle: local Parquet, optional S3 movement, cache layers, and compaction.

Quick start

Use the repository make targets for local development.

make help
make dev
make health

Run checks:

make fmt
make check
make test

Run functional tests:

make functional_tests_fast

Configuration

Start with config.toml.example and tune storage settings for your environment.

Common areas to review:

PostgreSQL connection
OTLP ports and API keys
local Parquet data directory
S3-compatible storage settings
retention window
write buffering and compaction
circuit breaker thresholds

Project layout

crates/applications/zradar-server     Server binary
crates/services/api-optel             OTLP ingestion services
crates/services/api                   Query and admin APIs
crates/core/zradar-parquet            Parquet storage and query engine
crates/core/zradar-models             Shared models and config
crates/plugins/zradar-plugin-postgres PostgreSQL repositories
crates/plugins/zradar-plugin-s3       S3 storage plugin
test_functional                       End-to-end scenarios

When zradar is a good fit

Try zradar if you are building:

AI agents with multi-step execution paths
LLM applications with token and cost concerns
RAG systems where retrieval and generation need correlation
tool-using agents where failures happen across many spans
observability infrastructure that should stay OpenTelemetry-native
telemetry systems where Parquet storage is preferable to keeping everything in an OLTP database

Documentation

config.toml.example for configuration
examples/README.md for instrumentation examples
test_functional/README.md for functional testing
AGENTS.md and CODING_GUIDELINES.md for repository contribution guidance

Status

zradar is under active development. The current architecture is Parquet-first for telemetry with PostgreSQL metadata and control-plane state.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.cargo		.cargo
.cursor		.cursor
config		config
crates		crates
data		data
docs		docs
ee		ee
examples		examples
scripts		scripts
test_functional		test_functional
.cursorrules		.cursorrules
.dockerignore		.dockerignore
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
CODING_GUIDELINES.md		CODING_GUIDELINES.md
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
Dockerfile.dev		Dockerfile.dev
LICENSE		LICENSE
LICENSE_EE		LICENSE_EE
Makefile		Makefile
NOTICE		NOTICE
README.md		README.md
config.server.toml		config.server.toml
config.test.toml		config.test.toml
config.toml.example		config.toml.example
docker-compose.test.yml		docker-compose.test.yml
docker-compose.yml		docker-compose.yml
docker-start.sh		docker-start.sh
env.example		env.example

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

zradar

Why developers try zradar

What you can observe

Agent workflows

LLM usage and cost

Logs, metrics, and traces together

Architecture

Core capabilities

Quick start

Configuration

Project layout

When zradar is a good fit

Documentation

Status

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

zradar

Why developers try zradar

What you can observe

Agent workflows

LLM usage and cost

Logs, metrics, and traces together

Architecture

Core capabilities

Quick start

Configuration

Project layout

When zradar is a good fit

Documentation

Status

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages