diff --git a/images/agentic-data-stack/01-architecture.webp b/images/agentic-data-stack/01-architecture.webp
new file mode 100644
index 000000000..00ac8695f
Binary files /dev/null and b/images/agentic-data-stack/01-architecture.webp differ
diff --git a/images/agentic-data-stack/prompt-chat.png b/images/agentic-data-stack/prompt-chat.png
new file mode 100644
index 000000000..3beaf972d
Binary files /dev/null and b/images/agentic-data-stack/prompt-chat.png differ
diff --git a/images/agentic-data-stack/select-clickhouse-local.png b/images/agentic-data-stack/select-clickhouse-local.png
new file mode 100644
index 000000000..fb7073602
Binary files /dev/null and b/images/agentic-data-stack/select-clickhouse-local.png differ
diff --git a/images/agentic-data-stack/select-model.png b/images/agentic-data-stack/select-model.png
new file mode 100644
index 000000000..2f692a3cd
Binary files /dev/null and b/images/agentic-data-stack/select-model.png differ
diff --git a/images/agentic-data-stack/set-api-key-modal.png b/images/agentic-data-stack/set-api-key-modal.png
new file mode 100644
index 000000000..93fe933fd
Binary files /dev/null and b/images/agentic-data-stack/set-api-key-modal.png differ
diff --git a/images/agentic-data-stack/set-api-key.png b/images/agentic-data-stack/set-api-key.png
new file mode 100644
index 000000000..b16ab7877
Binary files /dev/null and b/images/agentic-data-stack/set-api-key.png differ
diff --git a/products/agentic-data-stack/components/langfuse.mdx b/products/agentic-data-stack/components/langfuse.mdx
new file mode 100644
index 000000000..4644a4406
--- /dev/null
+++ b/products/agentic-data-stack/components/langfuse.mdx
@@ -0,0 +1,33 @@
+---
+sidebarTitle: 'Langfuse'
+title: 'Langfuse in the Agentic Data Stack'
+description: 'How Langfuse provides observability and tracing for the agentic data stack'
+doc_type: 'guide'
+keywords: ['agentic data stack', 'Langfuse', 'observability', 'tracing', 'LLM']
+---
+
+{/* DOC-736.
Thin stack-context page; links to canonical Langfuse docs rather than duplicating them. */}
+
+
+**Draft.** Not yet editorially reviewed.
+
+
+Langfuse is the **observability layer** of the [Agentic Data Stack](/products/agentic-data-stack/overview). It records what the agent did in LibreChat, so you can debug it, measure quality, and track cost. Built on OpenTelemetry, Langfuse runs on ClickHouse.
+
+## Trace and inspect every run {#trace}
+
+Every conversation is captured as a [Langfuse trace](https://langfuse.com/docs/observability/overview): the prompts, each tool call (including the SQL the agent ran), and the response. Each trace also records token usage, cost, and latency. Open a run to see what the agent did and where it failed. Sort by user and session to see who is spending the most.
+
+## Score output quality {#evals}
+
+Model output is nondeterministic, so Langfuse lets you measure quality instead of guessing at it. Score runs with [human annotation](https://langfuse.com/docs/evaluation/evaluation-methods/annotation) or an automated [LLM-as-a-judge](https://langfuse.com/docs/evaluation/evaluation-methods/llm-as-a-judge) evaluator, for example to flag when an answer is wrong or an analysis is unhelpful.
+
+## In the bundle {#in-the-bundle}
+
+The bundle wires LibreChat to Langfuse, so every run is traced automatically, with no instrumentation to add. Langfuse stores its data in the stack's ClickHouse. To run it as part of the stack, see the [Docker setup guide](/products/agentic-data-stack/docker-setup).
+
+To send traces from a standalone LibreChat instance, or to use a regional or HIPAA Langfuse endpoint, see the [Langfuse companion guide](https://langfuse.com/integrations/agentic-data-stack). For Langfuse on ClickHouse generally, see the [Langfuse overview](/products/cloud/features/ai-ml/langfuse).
+
+
+**Prefer a managed experience?** [Langfuse Cloud](https://cloud.langfuse.com) is a fully managed deployment powered by a managed ClickHouse cluster, with no infrastructure to run.
+
diff --git a/products/agentic-data-stack/components/librechat.mdx b/products/agentic-data-stack/components/librechat.mdx
new file mode 100644
index 000000000..3654eccf2
--- /dev/null
+++ b/products/agentic-data-stack/components/librechat.mdx
@@ -0,0 +1,59 @@
+---
+sidebarTitle: 'LibreChat'
+title: 'LibreChat in the Agentic Data Stack'
+description: 'How LibreChat serves as the chat and agent front-end of the agentic data stack'
+doc_type: 'guide'
+keywords: ['agentic data stack', 'LibreChat', 'chat', 'front-end', 'MCP']
+---
+
+{/* DOC-735. Thin stack-context page; links to canonical LibreChat docs rather than duplicating them. */}
+
+
+**Draft.** Not yet editorially reviewed.
+
+
+LibreChat is the **chat and agent front-end** of the [Agentic Data Stack](/products/agentic-data-stack/overview). Instead of writing SQL, a user asks a question in plain language and an agent answers it.
+
+The agent works through the ClickHouse MCP server to inspect your databases and tables, run read-only queries, and build an answer from the results. The bundle wires this up for you, so LibreChat queries your data from the first sign-in. Stand up the full stack with the [Docker setup guide](/products/agentic-data-stack/docker-setup).
+
+## Build an agent over your data {#build-an-agent}
+
+Build a reusable agent for a recurring question about your data. Two choices make it ClickHouse-aware: give it **Instructions** that describe your schema and preferred tables, and add the **ClickHouse-Local** MCP server so it can list databases and tables and run read-only queries. For building, reusing, and sharing agents, see [LibreChat's Agent Builder](https://www.librechat.ai/docs/features/agents).
+
+## Connect more MCP servers {#connect-more-mcp-servers}
+
+The agent isn't limited to ClickHouse.
Add any MCP server through [LibreChat's MCP settings](https://www.librechat.ai/docs/features/mcp) so one chat can reach other databases, internal APIs, or SaaS tools.
+
+## Generate charts and visualizations {#generate-charts}
+
+Ask the agent to visualize your results, for example "Chart the top 10 products by revenue," and it returns an interactive chart you can explore and share. Visualizations use [LibreChat Artifacts](https://www.librechat.ai/docs/features/artifacts), enabled per agent.
+
+## Run code with the code interpreter {#code-interpreter}
+
+Beyond SQL, the agent can run code in a secure sandbox to transform or analyze your results, such as turning a query into a file or a computed metric. This uses [LibreChat's Code Interpreter](https://www.librechat.ai/docs/features/code_interpreter).
+
+## Run long queries in the background {#run-in-background}
+
+A query can take a while, and you don't have to wait. With [LibreChat's resumable streams](https://www.librechat.ai/docs/features/resumable_streams), start a generation, switch to another conversation, and come back to the finished response.
+
+## Share an analysis as a read-only link {#share-an-analysis}
+
+Share a conversation as a read-only [shareable link](https://www.librechat.ai/docs/features/shareable_links) so others can review an analysis without rerunning it. The shared view includes the tool calls and the SQL behind each answer, giving a clear chain of custody for how a result was produced.
+
+## Control access to MCP servers {#control-access}
+
+In a team deployment, [role-based access control](https://www.librechat.ai/docs/features/access_control) governs who can use, create, and share MCP servers and agents, and at what level (Viewer, Editor, or Owner).
+
+## In the bundle {#in-the-bundle}
+
+LibreChat is preconfigured through `librechat.yaml`, so it works out of the box:
+
+- The [ClickHouse MCP server](/products/agentic-data-stack/components/mcp-server) is registered as a tool source, so the agent can explore and query ClickHouse with no extra setup.
+- Every conversation is traced to [Langfuse](/products/agentic-data-stack/components/langfuse) for observability, capturing prompts, tool calls, responses, cost, and latency.
+- The [Admin Panel](https://github.com/ClickHouse/librechat-admin-panel) (port 3081) is a browser-based UI for changing this configuration (endpoints, MCP servers, and agent settings) without editing `librechat.yaml` by hand.
+
+To connect the ClickHouse MCP server to a standalone LibreChat instance, see the canonical guide: [Using ClickHouse MCP server with LibreChat](/core/guides/use-cases/ai-ml/MCP/librechat). For LibreChat's full feature documentation, see the [LibreChat documentation](https://www.librechat.ai/docs).
+
+
+**Prefer a managed experience?** ClickHouse Cloud offers [ClickHouse Agents](/products/cloud/features/ai-ml/agents) (Beta), a hosted, no-setup agent experience built on the same foundation, with the agent-building features available through the Cloud console.
+
diff --git a/products/agentic-data-stack/components/mcp-server.mdx b/products/agentic-data-stack/components/mcp-server.mdx
new file mode 100644
index 000000000..15965d2e6
--- /dev/null
+++ b/products/agentic-data-stack/components/mcp-server.mdx
@@ -0,0 +1,30 @@
+---
+sidebarTitle: 'ClickHouse MCP server'
+title: 'The ClickHouse MCP server in the stack'
+description: 'How the ClickHouse MCP server provides the open-standard access layer for the agentic data stack'
+doc_type: 'guide'
+keywords: ['agentic data stack', 'MCP', 'Model Context Protocol', 'ClickHouse MCP server']
+---
+
+{/* DOC-737. Thin stack-context page; links to the MCP catalog and mcp-clickhouse repo rather than re-documenting the server.
*/}
+
+
+**Draft.** Not yet editorially reviewed.
+
+
+The ClickHouse MCP server is the **open-standard access layer** of the [Agentic Data Stack](/products/agentic-data-stack/overview): it exposes ClickHouse to the chat front-end over the Model Context Protocol, so the agent can explore your data and run read-only queries against it.
+
+In the bundle, the open-source server runs alongside ClickHouse and is preconfigured in LibreChat. It exposes three tools to the agent:
+
+- **List databases**: enumerate the databases in the connected ClickHouse instance.
+- **List tables**: inspect the tables and their schemas within a database.
+- **Run SELECT queries**: execute read-only queries and return the results.
+
+To run it as part of the stack, see the [Docker setup guide](/products/agentic-data-stack/docker-setup). To use the ClickHouse MCP server with other clients and agent frameworks, or to read its source and configuration:
+
+- [MCP integration catalog](/core/guides/use-cases/ai-ml/MCP): clients and agent-library guides.
+- [github.com/ClickHouse/mcp-clickhouse](https://github.com/ClickHouse/mcp-clickhouse): source and configuration.
+
+
+**Prefer a managed MCP server?** ClickHouse Cloud offers a fully managed [Remote MCP server](/products/cloud/features/ai-ml/remote-mcp): zero install, OAuth authentication, and a richer tool set that adds management operations (services, billing, ClickPipes, backups) on top of the read-only tools above. See its [comparison with the open-source server](/products/cloud/features/ai-ml/remote-mcp#remote-vs-oss).
+
diff --git a/products/agentic-data-stack/docker-setup.mdx b/products/agentic-data-stack/docker-setup.mdx
new file mode 100644
index 000000000..b38e5d94a
--- /dev/null
+++ b/products/agentic-data-stack/docker-setup.mdx
@@ -0,0 +1,145 @@
+---
+sidebarTitle: 'Docker setup'
+title: 'Set up the Agentic Data Stack with Docker Compose'
+description: 'Run the full agentic data stack (ClickHouse, LibreChat, the MCP server, and Langfuse) via Docker Compose'
+doc_type: 'guide'
+keywords: ['agentic data stack', 'docker compose', 'setup', 'LibreChat', 'MCP', 'Langfuse']
+---
+
+import { Image } from "/snippets/components/Image.jsx";
+
+Run the complete Agentic Data Stack locally with Docker Compose to ask questions of your data from the first login. One `docker compose up` command brings up [LibreChat](/products/agentic-data-stack/components/librechat), the [ClickHouse MCP server](/products/agentic-data-stack/components/mcp-server), [ClickHouse](/core/get-started/setup/install), and [Langfuse](/products/agentic-data-stack/components/langfuse) for observability.
+
+## Prerequisites {#prerequisites}
+
+- **Docker** with the Compose plugin (Compose v2 or later).
+- **Git**, to clone the repository.
+- A model provider API key (such as OpenAI, Anthropic, or Google). The agent needs a model to answer questions, so supply a key during setup or add one in the LibreChat UI before your first chat.
+
+## Stand up the stack {#stand-up-the-stack}
+
+
+
+```bash
+git clone https://github.com/ClickHouse/agentic-data-stack
+cd agentic-data-stack
+```
+
+The repository ships with a top-level `docker-compose.yml`, so the whole stack comes up with a single command. See [Architecture](#how-it-is-wired) for the full list of services.
+
+
+
+```bash
+./scripts/prepare-demo.sh
+```
+
+This generates a `.env` file with credentials for every service, then offers an interactive menu to configure API keys for a chosen provider. You can also set these keys directly in the `.env` file.
Any provider you skip stays set to `user_provided`, so you can add your own key in the LibreChat UI instead.
+
+On first startup, the stack creates an admin user from `.env`. The default login is `admin@admin.com` / `password`.
+
+
+To set your own admin credentials instead of the defaults, run `generate-env.sh` with these variables before `prepare-demo.sh`:
+
+```bash
+USER_EMAIL="you@example.com" USER_PASSWORD="supersecret" USER_NAME="YourName" ./scripts/generate-env.sh
+```
+
+`prepare-demo.sh` then sees the existing `.env` and goes straight to API-key configuration.
+
+
+
+
+```bash
+docker compose up -d
+```
+
+Startup is ordered automatically. LibreChat boots only after the MCP server is healthy, so its connection to ClickHouse is ready on first load.
+
+
+
+Once the stack is up, the services are available in your browser:
+
+- **LibreChat** (chat UI): [http://localhost:3080](http://localhost:3080)
+- **Langfuse** (observability): [http://localhost:3000](http://localhost:3000)
+- **Admin Panel** (browser-based LibreChat configuration): [http://localhost:3081](http://localhost:3081)
+- **MinIO console** (object storage; credentials in `.env`): [http://localhost:9091](http://localhost:9091)
+
+Sign in to LibreChat with the admin credentials from your `.env` file.
+
+
+
+A model is selected by default. If you want to change it, open the model selector and choose the one you want to use.
+
+The LibreChat model selector in the top-left corner, showing the default model with a Select a model tooltip
+
+If you didn't set a provider key during setup, add one in the UI.
+
+
+Open the model selector and click **Set API Key** next to the provider.
+
+LibreChat model selector with the Set API Key button next to the Anthropic provider
+
+Paste your key in the dialog and click **Submit**. You can set an expiration, or keep the key from expiring.
+
+The Set API Key for Anthropic dialog with an expiration dropdown, a Key field, and Submit and Revoke buttons
+
+
+
+
+The stack preconfigures its MCP servers in LibreChat's `librechat.yaml`. In the message composer, click **MCP Servers** and select **ClickHouse-Local**.
+
+Select **ClickHouse-Cloud** instead to use a ClickHouse Cloud service.
+
+{/* TODO: guide users (or link out to the Remote MCP docs at /products/cloud/features/ai-ml/remote-mcp) on connecting ClickHouse-Cloud, which requires an OAuth flow, unlike the bundled local server. */}
+
+The MCP Servers menu open in the LibreChat composer, showing ClickHouse-Local with a green connected dot and ClickHouse-Cloud
+
+
+
+For example:
+
+> What databases and tables are available, and how many rows are in the largest table?
+
+The agent uses the MCP server's tools to list databases and tables, run read-only queries against ClickHouse, and build an answer from the results. You don't need to write SQL.
+
+LibreChat answering the example question by calling the ClickHouse-Local MCP tools and listing the available databases and tables
+
+
+
+## Stop or reset the stack {#stop-or-reset}
+
+Stop the services and remove their containers while keeping the data volumes:
+
+```bash
+docker compose down
+```
+
+To tear down all containers and wipe every volume for a clean start, use the stack's reset script:
+
+```bash
+./scripts/reset-all.sh
+```
+
+## Architecture {#how-it-is-wired}
+
+`docker-compose.yml` is a thin entrypoint that includes four Compose files:
+
+| Compose file | Defines |
+|---|---|
+| `langfuse-compose.yml` | Langfuse and its backing services (ClickHouse, PostgreSQL, Redis, MinIO) |
+| `clickhouse-mcp-compose.yml` | The ClickHouse MCP server |
+| `librechat-compose.yml` | LibreChat and its backing services (MongoDB, Meilisearch, pgvector, RAG API) |
+| `admin-panel-compose.yml` | The LibreChat Admin Panel |
+
+Two details make the single-command startup work:
+
+- **Health checks and start order.** Compose uses health checks to sequence
startup. The MCP server waits for ClickHouse, and LibreChat waits for the MCP server.
+- **Shared environment file.** The `.env` holds each service's credentials and connection values, set consistently so the services can reach each other. For example, the MCP server connects to ClickHouse with the ClickHouse credentials from `.env`. LibreChat is given `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and `LANGFUSE_BASE_URL`, so every run is traced to Langfuse out of the box.
+
+ClickHouse plays two roles in the stack: it's both Langfuse's storage backend and the database your agent queries through the MCP server.
+
+## Next steps {#next-steps}
+
+- Learn what each piece does in the stack: [ClickHouse MCP server](/products/agentic-data-stack/components/mcp-server), [LibreChat](/products/agentic-data-stack/components/librechat), and [Langfuse](/products/agentic-data-stack/components/langfuse).
+- See the [overview](/products/agentic-data-stack/overview) for how the stack fits together.
+- To try the stack against public datasets without installing anything, use [AgentHouse](https://llm.clickhouse.com), the hosted demo.
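After `docker compose up -d`, the browser URLs listed under "Stand up the stack" can also be probed from a shell. The following is a minimal sketch, assuming `curl` is installed and the default ports from this guide are unchanged; `service_url` and `check_service` are hypothetical helper names for illustration, not scripts shipped in the repository.

```shell
#!/bin/sh
# Sketch only: report whether each web UI of the stack answers over HTTP.
# The URLs are the ones listed in this guide; service_url and check_service
# are hypothetical helpers, not part of the agentic-data-stack repository.

# Map a service name to the local URL documented in this guide.
service_url() {
  case "$1" in
    librechat)   echo "http://localhost:3080" ;;
    langfuse)    echo "http://localhost:3000" ;;
    admin-panel) echo "http://localhost:3081" ;;
    minio)       echo "http://localhost:9091" ;;
    *)           return 1 ;;
  esac
}

# Report whether a service returns any HTTP response within 5 seconds.
check_service() {
  if ! url="$(service_url "$1")"; then
    echo "$1: unknown service"
    return 1
  fi
  if curl --silent --output /dev/null --max-time 5 "$url"; then
    echo "$1: reachable at $url"
  else
    echo "$1: not reachable at $url"
    return 1
  fi
}

# Check every service named on the command line.
status=0
for svc in "$@"; do
  check_service "$svc" || status=1
done
[ "$status" -eq 0 ] || echo "one or more services are not reachable" >&2
```

Saved under a name such as `check-stack.sh` (hypothetical), it could be run as `sh check-stack.sh librechat langfuse admin-panel minio`. A service that reports as not reachable may simply still be starting; `docker compose ps` shows each container's current state.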
diff --git a/products/agentic-data-stack/navigation.json b/products/agentic-data-stack/navigation.json
index 5bddf3b3c..074ac913b 100644
--- a/products/agentic-data-stack/navigation.json
+++ b/products/agentic-data-stack/navigation.json
@@ -3,9 +3,18 @@
   "icon": "/images/icons/icon-agentic-data-stack.svg",
   "groups": [
     {
-      "group": "Overview",
+      "group": "Get started",
       "pages": [
-        "products/agentic-data-stack/overview"
+        "products/agentic-data-stack/overview",
+        "products/agentic-data-stack/docker-setup"
+      ]
+    },
+    {
+      "group": "Components",
+      "pages": [
+        "products/agentic-data-stack/components/librechat",
+        "products/agentic-data-stack/components/mcp-server",
+        "products/agentic-data-stack/components/langfuse"
       ]
     }
   ]
diff --git a/products/agentic-data-stack/overview.mdx b/products/agentic-data-stack/overview.mdx
index 01e99767d..a1eea4257 100644
--- a/products/agentic-data-stack/overview.mdx
+++ b/products/agentic-data-stack/overview.mdx
@@ -1,6 +1,58 @@
 ---
+sidebarTitle: 'Overview'
 title: 'Agentic Data Stack'
-description: 'Overview of the Agentic Data Stack'
+description: 'Agentic analytics on ClickHouse: fully managed in ClickHouse Cloud, or self-hosted as an open-source stack of ClickHouse, the MCP server, LibreChat, and Langfuse.'
+doc_type: 'landing-page'
+keywords: ['agentic data stack', 'agentic analytics', 'ClickHouse', 'MCP', 'LibreChat', 'Langfuse', 'AI']
 ---
 
-Overview of the Agentic Data Stack.
\ No newline at end of file
+import { Image } from "/snippets/components/Image.jsx";
+
+The easiest way to run agentic analytics on ClickHouse is [ClickHouse Agents](/products/cloud/features/ai-ml/agents) in ClickHouse Cloud: fully managed, with no infrastructure to run. Users ask questions in plain language, and an AI agent answers by querying the database directly.
+
+To self-host, the **Agentic Data Stack** is a composable open-source stack. You run it yourself, connect your own models, and keep your chat and data in your own environment.
It's built from [ClickHouse](/core/get-started/setup/install), the [ClickHouse MCP server](/products/agentic-data-stack/components/mcp-server), [LibreChat](/products/agentic-data-stack/components/librechat), and [Langfuse](/products/agentic-data-stack/components/langfuse).
+
+## What is agentic analytics? {#what-is-agentic-analytics}
+
+In agentic analytics, the model grounds its answers by running queries against your data. Given a question, the agent inspects the available databases and tables, decides which queries to run, executes them against ClickHouse, and builds an answer from the results. It can refine a query, run a follow-up, or chain several steps together. When a query fails or returns something unexpected, it adjusts and tries again instead of stopping.
+
+## What you can do {#what-you-can-do}
+
+- **Ask questions in natural language** and get answers drawn from your own data.
+- **Build agents with no code:** give an agent instructions and tools, then reuse it.
+- **Share agents and conversations** as read-only links, so others can trace the queries behind an answer.
+- **Generate interactive charts and visualizations** from query results inside a conversation.
+- **Evaluate and improve answers:** score responses in Langfuse with human review or an LLM judge, then refine your prompts and agents.
+
+## How the stack fits together {#architecture}
+
+Agentic Data Stack architecture: users interact with LibreChat, which connects to an LLM, to ClickHouse through the MCP server, and to Langfuse for tracing
+
+A user asks a question in LibreChat. The model plans a response and, through the MCP server, calls tools to explore and query ClickHouse. Results flow back, and the agent composes an answer. Langfuse, built on OpenTelemetry, records each run from prompt to tool call to response, lets you score outputs automatically or with human review, and tracks quality, cost, and latency.
+
+The ClickHouse MCP server is built on the [Model Context Protocol](https://modelcontextprotocol.io/), an open standard, so it works with any MCP-compatible client or agent framework, not only LibreChat. See the [MCP guides](/core/guides/use-cases/ai-ml/MCP) for clients and agent libraries.
+
+## Components {#components}
+
+| Component | Role | Learn more |
+|-----------|------|------------|
+| ClickHouse | The analytical engine the agent queries | [Get started with ClickHouse](/core/get-started/setup/install) |
+| ClickHouse MCP server | The open-standard access layer that exposes ClickHouse to the agent as tools | [MCP server](/products/agentic-data-stack/components/mcp-server) |
+| LibreChat | The chat and agent front-end users interact with | [LibreChat](/products/agentic-data-stack/components/librechat) |
+| Langfuse | Observability for every prompt, tool call, and response | [Langfuse](/products/agentic-data-stack/components/langfuse) |
+
+## Get started {#get-started}
+
+There are two ways to run agentic analytics on ClickHouse:
+
+- **Managed (ClickHouse Cloud):** the fastest path, with no setup. [ClickHouse Agents](/products/cloud/features/ai-ml/agents) provides hosted chat and agents over your data. The individual pieces are also available managed: the [Remote MCP server](/products/cloud/features/ai-ml/remote-mcp) and [Langfuse Cloud](https://cloud.langfuse.com).
+- **Self-hosted (open source):** run the full stack yourself with [Docker Compose](/products/agentic-data-stack/docker-setup), connecting your own models and keeping your data in your environment.
+
+To try the stack against public datasets without installing anything, use [AgentHouse](https://llm.clickhouse.com), the hosted demo.
+
+## Related {#related}
+
+Other open-source AI capabilities on ClickHouse:
+
+- [AI-powered SQL generation](/core/guides/use-cases/ai-ml/ai-powered-sql-generation): natural-language to SQL in ClickHouse Client and clickhouse-local
+- [Vector search with QBit](/core/guides/use-cases/ai-ml/vector-search): runtime-tunable vector search