ClickGraph - A high-performance, stateless, read-only graph query service for ClickHouse, written in Rust, with Neo4j ecosystem compatibility (Cypher and Bolt Protocol 5.8). It now also supports an embedded mode with local writes and export of query results to external destinations, with Go and Python bindings in addition to native Rust.
Note: The ClickGraph dev release is beta quality for view-based graph analytics applications. Please raise an issue if you encounter any problems.
- Viewing ClickHouse databases (including external sources) as graph data with graph analytics capability brings another level of abstraction and boosts productivity with graph tools, and enables agentic GraphRAG support with local writes.
- Research shows relational analytics with columnar stores and vectorized execution engines like ClickHouse provide superior analytical performance and scalability to graph-native technologies, which usually leverage explicit adjacency representations and are more suitable for local-area graph traversals.
- View-based graph analytics offers the benefits of zero-ETL, avoiding data migration and duplicated storage cost, while delivering better performance and scalability than most native graph analytics options.
- Neo4j Bolt protocol support gives access to the existing ecosystem of Bolt-based tools.
- Hybrid remote query + local storage - Execute Cypher queries against a remote ClickHouse cluster from embedded mode, then store results locally in chdb as a subgraph for fast re-querying. query_remote(), query_remote_graph(), store_subgraph() - ideal for GraphRAG context enrichment. Available in Rust, Python, and Go. See Embedded Mode.
- Embedded write API for GraphRAG - create_node(), create_edge(), upsert_node() with batch variants. AI agents can extract entities from documents, store them as graph data, and query with Cypher, all in-process. See Embedded Mode Write API.
- Kuzu API parity - Value::Date/Timestamp/UUID types, query timing (compiling_time/execution_time), Database::in_memory(), Connection::set_query_timeout(), column type metadata, multi-format file import (CSV/Parquet/JSON).
- DataFrame output - result.get_as_df() (Pandas), result.get_as_arrow() (PyArrow), result.get_as_pl() (Polars) for Python data science workflows.
- Tutorials and examples - 5 runnable Python examples covering quick start, DataFrames, write API, GraphRAG hybrid workflow, and export formats. See examples/embedded/.
- 1,599 unit tests - Up from 1,591, with hybrid E2E tests and comprehensive Value type coverage.
See CHANGELOG.md for complete release history.
- Cypher-to-SQL Translation - Industry-standard Cypher read syntax translated to optimized ClickHouse SQL
- Stateless Architecture - Offloads all query execution to ClickHouse; no extra datastore required
- Embedded Mode - In-process graph queries over Parquet/Iceberg/Delta/S3 via chdb; no ClickHouse server needed (--features embedded)
- LLM-powered schema discovery - :discover command generates YAML schema from ClickHouse table metadata using Anthropic or OpenAI.
- Variable-Length Paths - Recursive traversals with *1..3 syntax using ClickHouse WITH RECURSIVE CTEs
- Path Functions - length(p), nodes(p), relationships(p) for path analysis
- Parameterized Queries - Neo4j-compatible $param syntax for SQL injection prevention
- Query Cache - LRU caching with 10-100x speedup for repeated translations
- ClickHouse Functions - Pass-through via ch.function_name() and chagg.aggregate() prefixes
- GraphRAG structured output - format: "Graph" returns deduplicated nodes, edges, and stats for graph visualization and RAG pipelines.
- Query Metrics - Phase-by-phase timing via HTTP headers and structured logging
- ClickHouse cluster load balancing - CLICKHOUSE_CLUSTER env var auto-discovers and balances queries across cluster nodes.
- Complex queries like LDBC SNB benchmark: 36/37 queries (97%) - Near-complete Social Network Benchmark coverage. See benchmark results for performance data on sf0.003, sf1 and sf10 datasets.
- Bolt Protocol v5.8 - Full Neo4j driver compatibility (cypher-shell, Neo4j Browser, graph-notebook)
- HTTP REST API - Complete query execution with parameters and aggregations
- Multi-Schema Support - Per-request schema selection via USE clause, session parameter, or default
- Authentication - Multiple auth schemes including basic auth
- Zero Migration - Map existing tables to graph format through YAML configuration
- Auto-Discovery - auto_discover_columns: true queries ClickHouse metadata automatically
- Dynamic Schema Loading - Runtime schema registration via POST /schemas/load
- Composite Node IDs - Multi-column identity (e.g., node_id: [tenant_id, user_id])
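The variable-length path feature listed above maps a bounded pattern such as *1..3 onto a recursive CTE. The following is a minimal sketch of that idea only, using Python's built-in sqlite3 module in place of ClickHouse. The table name follows_demo, its columns, the helper reachable, and the sample rows are all invented for illustration, and the SQL that ClickGraph actually generates will differ.

```python
import sqlite3

# Hypothetical edge table contents, invented for this illustration.
EDGES = [(1, 2), (2, 3), (3, 4), (4, 5)]

# Roughly what MATCH (a)-[:follows*1..3]->(b) asks for when a is node 1:
# every node reachable from the start node in min_hops..max_hops steps.
# The hop counter bounds the recursion, so cycles cannot loop forever.
RECURSIVE_SQL = """
WITH RECURSIVE paths(end_id, hops) AS (
    SELECT to_id, 1 FROM follows_demo WHERE from_id = :start
    UNION ALL
    SELECT e.to_id, p.hops + 1
    FROM paths AS p
    JOIN follows_demo AS e ON e.from_id = p.end_id
    WHERE p.hops < :max_hops
)
SELECT end_id, hops FROM paths WHERE hops >= :min_hops ORDER BY hops, end_id
"""


def reachable(start, min_hops, max_hops):
    """Return (node_id, hops) pairs reachable within the hop bounds."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE follows_demo (from_id INTEGER, to_id INTEGER)")
        conn.executemany("INSERT INTO follows_demo VALUES (?, ?)", EDGES)
        rows = conn.execute(
            RECURSIVE_SQL,
            {"start": start, "min_hops": min_hops, "max_hops": max_hops},
        ).fetchall()
    finally:
        conn.close()
    return rows


if __name__ == "__main__":
    print(reachable(1, 1, 3))
```

The lower bound is applied as a filter on the final result, while the upper bound stops the recursion itself; this keeps the traversal finite even on cyclic graphs.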
ClickGraph runs as a lightweight stateless query translator alongside ClickHouse:
flowchart LR
Clients["Graph Clients<br/><br/>HTTP/REST<br/>Bolt Protocol<br/>(Neo4j tools)"]
ClickGraph["ClickGraph<br/><br/>Cypher -> SQL<br/>Translator<br/><br/>:8080 (HTTP)<br/>:7687 (Bolt)"]
ClickHouse["ClickHouse<br/><br/>Columnar Storage<br/>Query Engine"]
Clients -->|Cypher| ClickGraph
ClickGraph -->|SQL| ClickHouse
ClickHouse -->|Results| ClickGraph
ClickGraph -->|Results| Clients
style ClickGraph fill:#e1f5ff,stroke:#0288d1,stroke-width:3px
style ClickHouse fill:#fff3e0,stroke:#f57c00,stroke-width:3px
style Clients fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
Three-tier architecture: Graph clients -> ClickGraph translator -> ClickHouse database
New to ClickGraph? See the Getting Started Guide for a complete walkthrough.
# Pull the latest image
docker pull genezhang/clickgraph:latest
# Start ClickHouse only
docker-compose up -d clickhouse-service
# Run ClickGraph from Docker Hub image
docker run -d \
--name clickgraph \
--network clickgraph_default \
-p 8080:8080 \
-p 7687:7687 \
-e CLICKHOUSE_URL="http://clickhouse-service:8123" \
-e CLICKHOUSE_USER="test_user" \
-e CLICKHOUSE_PASSWORD="test_pass" \
-e GRAPH_CONFIG_PATH="/app/schemas/social_benchmark.yaml" \
-v $(pwd)/benchmarks/social_network/schemas:/app/schemas:ro \
genezhang/clickgraph:latest
Or use docker-compose (uses published image by default):
docker-compose up -d
# Prerequisites: Rust toolchain (1.85+) and Docker for ClickHouse
# 1. Clone and start ClickHouse
git clone https://github.com/genezhang/clickgraph
cd clickgraph
docker-compose up -d clickhouse-service
# 2. Build and run
cargo build --release
export CLICKHOUSE_URL="http://localhost:8123"
export CLICKHOUSE_USER="test_user"
export CLICKHOUSE_PASSWORD="test_pass"
export GRAPH_CONFIG_PATH="./benchmarks/social_network/schemas/social_benchmark.yaml"
cargo run --bin clickgraph
GRAPH_CONFIG_PATH is required. It tells ClickGraph how to map ClickHouse tables to graph nodes and edges.
# HTTP API
curl -X POST http://localhost:8080/query \
-H "Content-Type: application/json" \
-d '{"query": "MATCH (u:User) RETURN u.full_name LIMIT 5"}'
# Bolt protocol (cypher-shell, Neo4j Browser, or any Neo4j driver)
cypher-shell -a bolt://localhost:7687 -u neo4j -p password
Run the included demo for interactive graph visualization:
cd demos/neo4j-browser && bash setup.sh
Then open http://localhost:7474 and connect to bolt://localhost:7687.
See demos/neo4j-browser/README.md for details.
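For programmatic access, the curl call to the HTTP API shown earlier can be reproduced with Python's standard library. This is a sketch under assumptions: the /query endpoint and the query field come from the curl example, while the helper names build_query_request and run_query are hypothetical. The layout of the JSON response is not described here, so the result is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request

# Default address taken from the curl example; adjust for your deployment.
DEFAULT_URL = "http://localhost:8080/query"


def build_query_request(cypher, url=DEFAULT_URL):
    """Build (but do not send) a POST request carrying a Cypher query."""
    body = json.dumps({"query": cypher}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(cypher, url=DEFAULT_URL, timeout=30):
    """Send the query and return the decoded JSON response.

    Requires a running ClickGraph server. The response layout is whatever
    the server returns; it is not interpreted here.
    """
    request = build_query_request(cypher, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the request, so no server is needed to try it.
    req = build_query_request("MATCH (u:User) RETURN u.full_name LIMIT 5")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Separating request construction from sending makes the payload easy to inspect or unit-test without a server; call run_query once ClickGraph is listening on port 8080.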
ClickGraph implements apoc.meta.schema() and Neo4j-compatible schema procedures, enabling AI assistants (Claude, etc.) to discover your graph structure via MCP servers like @anthropic-ai/mcp-server-neo4j and @neo4j/mcp-neo4j.
See the MCP Setup Guide for configuration details.
cargo build --release -p clickgraph-client
./target/release/clickgraph-client # connects to http://localhost:8080
Map your tables to a graph with YAML:
views:
  - name: social_network
    nodes:
      - label: user
        table: users
        database: mydb
        node_id: user_id
        property_mappings:
          name: full_name
    edges:
      - type: follows
        table: user_follows
        database: mydb
        from_node: user
        to_node: user
        from_id: follower_id
        to_id: followed_id
MATCH (u:user)-[:follows]->(friend:user)
WHERE u.name = 'Alice'
RETURN friend.name
- Getting Started - Setup walkthrough and first queries
- Features Overview - Comprehensive feature list
- API Documentation - HTTP REST API and Bolt protocol
- Configuration Guide - Server configuration and CLI options
- Wiki - Comprehensive guides: Cypher Reference, Schema Basics, Graph-Notebook, Neo4j Tools
- Examples - Quick Start | E-commerce Analytics
- Dev Quick Start - 30-second workflow for contributors
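To make the social_network YAML mapping and the MATCH query shown earlier more concrete, here is a hand-written sketch of how such a mapping could resolve to a relational join. It is not ClickGraph's translator and does not reproduce its generated SQL. SQLite (via Python's sqlite3) stands in for ClickHouse, the mydb database prefix is dropped, and the helpers follows_sql and friends_of as well as the sample rows are invented for illustration.

```python
import sqlite3

# The mapping from the YAML example, written as plain Python data.
NODE = {"label": "user", "table": "users", "node_id": "user_id",
        "property_mappings": {"name": "full_name"}}
EDGE = {"type": "follows", "table": "user_follows",
        "from_id": "follower_id", "to_id": "followed_id"}


def follows_sql(node, edge, prop):
    """Build SQL for: MATCH (u)-[:follows]->(friend) WHERE u.prop = ? RETURN friend.prop."""
    column = node["property_mappings"][prop]  # graph property -> table column
    return (
        f"SELECT friend.{column} "
        f"FROM {node['table']} AS u "
        f"JOIN {edge['table']} AS e ON e.{edge['from_id']} = u.{node['node_id']} "
        f"JOIN {node['table']} AS friend ON friend.{node['node_id']} = e.{edge['to_id']} "
        f"WHERE u.{column} = ? "
        f"ORDER BY friend.{column}"  # ordering added only to make output stable
    )


def friends_of(name):
    """Run the generated SQL against a tiny in-memory sample dataset."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (user_id INTEGER, full_name TEXT)")
        conn.execute("CREATE TABLE user_follows (follower_id INTEGER, followed_id INTEGER)")
        conn.executemany("INSERT INTO users VALUES (?, ?)",
                         [(1, "Alice"), (2, "Bob"), (3, "Carol")])
        conn.executemany("INSERT INTO user_follows VALUES (?, ?)",
                         [(1, 2), (1, 3), (2, 3)])
        # The name is passed as a bound parameter, never spliced into the SQL.
        rows = conn.execute(follows_sql(NODE, EDGE, "name"), (name,)).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(follows_sql(NODE, EDGE, "name"))
    print(friends_of("Alice"))
```

The point of the sketch is the two lookups a view-based mapping implies: the graph property name resolves to the column full_name, and the follows edge resolves to a join through user_follows on follower_id and followed_id.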
Current Version: v0.6.5-dev
- Rust Unit Tests: 1,588 passing (100%)
- Integration Tests: 3,068 passing (108 environment-dependent)
- LDBC SNB: 36/37 queries passing (97%)
- Benchmarks: 14/14 passing (100%)
- E2E Tests: Bolt 4/4, Cache 5/5 (100%)
- Read-Only Engine: Write operations not supported by design
- Anonymous Nodes: Use named nodes for better SQL generation
See STATUS.md and KNOWN_ISSUES.md for details.
| Phase | Version | Status |
|---|---|---|
| Phase 1 | v0.4.0 | Complete - Query cache, parameters, Bolt protocol |
| Phase 2 | v0.5.0 | Complete - Multi-tenancy, RBAC, auto-schema discovery |
| Phase 2.5-2.6 | v0.5.2-v0.5.3 | Complete - Schema variations, Cypher functions |
| Phase 3 | v0.6.3 | Complete - WITH redesign, GraphRAG, LDBC SNB, MCP |
| Phase 4 | v0.6.x | Next - user-requested features, advanced optimizations |
See ROADMAP.md for detailed feature tracking.
Contributions welcome! See DEV_QUICK_START.md to get started and DEVELOPMENT_PROCESS.md for the full workflow.
ClickGraph is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
ClickGraph is developed from a fork of Brahmand, adding zero-ETL view-based graph querying, Neo4j ecosystem compatibility, and enterprise deployment capabilities.
