genezhang/clickgraph

 
 

ClickGraph

ClickGraph is a high-performance, stateless, read-only graph query service for ClickHouse, written in Rust, with Neo4j ecosystem compatibility: Cypher and Bolt Protocol 5.8 support. It now also supports an embedded mode with local writes and exporting query results to external destinations, with Go and Python bindings in addition to native Rust.

Note: The ClickGraph dev release is beta quality for view-based graph analytics applications. Please raise an issue if you encounter any problems.


Motivation and Rationale

  • Viewing ClickHouse databases (including external sources) as graph data with graph analytics capability adds a layer of abstraction, boosts productivity with graph tools, and enables agentic GraphRAG support with local writes.
  • Research shows that relational analytics on columnar stores with vectorized execution engines, such as ClickHouse, outperform and outscale graph-native technologies on analytical workloads. Graph-native systems usually rely on explicit adjacency representations and are better suited to local-area graph traversals.
  • View-based graph analytics offer the benefits of zero-ETL, with no data migration or duplicated storage cost, and better performance and scalability than most native graph analytics options.
  • Neo4j Bolt protocol support gives access to the tools built on the Bolt protocol.

What's New in v0.6.5-dev

  • Hybrid remote query + local storage — Execute Cypher queries against a remote ClickHouse cluster from embedded mode, then store results locally in chdb as a subgraph for fast re-querying. query_remote(), query_remote_graph(), store_subgraph() — ideal for GraphRAG context enrichment. Available in Rust, Python, and Go. See Embedded Mode.
  • Embedded write API for GraphRAG - create_node(), create_edge(), upsert_node() with batch variants. AI agents can extract entities from documents, store them as graph data, and query with Cypher, all in-process. See Embedded Mode Write API.
  • Kuzu API parity - Value::Date/Timestamp/UUID types, query timing (compiling_time/execution_time), Database::in_memory(), Connection::set_query_timeout(), column type metadata, multi-format file import (CSV/Parquet/JSON).
  • DataFrame output - result.get_as_df() (Pandas), result.get_as_arrow() (PyArrow), result.get_as_pl() (Polars) for Python data science workflows.
  • Tutorials and examples — 5 runnable Python examples covering quick start, DataFrames, write API, GraphRAG hybrid workflow, and export formats. See examples/embedded/.
  • 1,599 unit tests — Up from 1,591, with hybrid E2E tests and comprehensive Value type coverage.

See CHANGELOG.md for complete release history.


Features

Core Capabilities

  • Cypher-to-SQL Translation - Industry-standard Cypher read syntax translated to optimized ClickHouse SQL
  • Stateless Architecture - Offloads all query execution to ClickHouse; no extra datastore required
  • Embedded Mode - In-process graph queries over Parquet/Iceberg/Delta/S3 via chdb; no ClickHouse server needed (--features embedded)
  • LLM-powered schema discovery - :discover command generates YAML schema from ClickHouse table metadata using Anthropic or OpenAI.
  • Variable-Length Paths - Recursive traversals with *1..3 syntax using ClickHouse WITH RECURSIVE CTEs
  • Path Functions - length(p), nodes(p), relationships(p) for path analysis
  • Parameterized Queries - Neo4j-compatible $param syntax for SQL injection prevention
  • Query Cache - LRU caching with 10-100x speedup for repeated translations
  • ClickHouse Functions - Pass-through via ch.function_name() and chagg.aggregate() prefixes
  • GraphRAG structured output - format: "Graph" returns deduplicated nodes, edges, and stats for graph visualization and RAG pipelines.
  • Query Metrics - Phase-by-phase timing via HTTP headers and structured logging
  • ClickHouse cluster load balancing - CLICKHOUSE_CLUSTER env var auto-discovers and balances queries across cluster nodes.
  • LDBC SNB Coverage - 36/37 queries (97%) of the LDBC Social Network Benchmark pass, giving near-complete coverage of its complex queries. See benchmark results for performance data on the sf0.003, sf1 and sf10 datasets.
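
The Query Cache bullet above describes LRU caching of repeated translations. The general idea can be sketched as follows. This is a generic illustration of an LRU cache keyed by query text, not ClickGraph's actual implementation; `TranslationCache` and `fake_translate` are hypothetical names introduced only for this example.

```python
from collections import OrderedDict


class TranslationCache:
    """Tiny LRU cache keyed by Cypher text (illustrative only)."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_translate(self, cypher, translate):
        if cypher in self._entries:
            self.hits += 1
            self._entries.move_to_end(cypher)  # mark as most recently used
            return self._entries[cypher]
        self.misses += 1
        sql = translate(cypher)
        self._entries[cypher] = sql
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return sql


def fake_translate(cypher):
    # Hypothetical stand-in for a real Cypher-to-SQL translator.
    return f"-- SQL for: {cypher}"
```

With parameterized queries, the query text stays the same while only the `$param` values change, so a cache keyed this way could plausibly reuse one translation across many calls.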

Neo4j Ecosystem Compatibility

  • Bolt Protocol v5.8 - Full Neo4j driver compatibility (cypher-shell, Neo4j Browser, graph-notebook)
  • HTTP REST API - Complete query execution with parameters and aggregations
  • Multi-Schema Support - Per-request schema selection via USE clause, session parameter, or default
  • Authentication - Multiple auth schemes including basic auth

View-Based Graph Model

  • Zero Migration - Map existing tables to graph format through YAML configuration
  • Auto-Discovery - auto_discover_columns: true queries ClickHouse metadata automatically
  • Dynamic Schema Loading - Runtime schema registration via POST /schemas/load
  • Composite Node IDs - Multi-column identity (e.g., node_id: [tenant_id, user_id])

Architecture

ClickGraph runs as a lightweight stateless query translator alongside ClickHouse:

flowchart LR
    Clients["Graph Clients<br/><br/>HTTP/REST<br/>Bolt Protocol<br/>(Neo4j tools)"]

    ClickGraph["ClickGraph<br/><br/>Cypher -> SQL<br/>Translator<br/><br/>:8080 (HTTP)<br/>:7687 (Bolt)"]

    ClickHouse["ClickHouse<br/><br/>Columnar Storage<br/>Query Engine"]

    Clients -->|Cypher| ClickGraph
    ClickGraph -->|SQL| ClickHouse
    ClickHouse -->|Results| ClickGraph
    ClickGraph -->|Results| Clients

    style ClickGraph fill:#e1f5ff,stroke:#0288d1,stroke-width:3px
    style ClickHouse fill:#fff3e0,stroke:#f57c00,stroke-width:3px
    style Clients fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px

Three-tier architecture: Graph clients -> ClickGraph translator -> ClickHouse database


Quick Start

New to ClickGraph? See the Getting Started Guide for a complete walkthrough.

Option 1: Docker (Recommended)

# Pull the latest image
docker pull genezhang/clickgraph:latest

# Start ClickHouse only
docker-compose up -d clickhouse-service

# Run ClickGraph from Docker Hub image
docker run -d \
  --name clickgraph \
  --network clickgraph_default \
  -p 8080:8080 \
  -p 7687:7687 \
  -e CLICKHOUSE_URL="http://clickhouse-service:8123" \
  -e CLICKHOUSE_USER="test_user" \
  -e CLICKHOUSE_PASSWORD="test_pass" \
  -e GRAPH_CONFIG_PATH="/app/schemas/social_benchmark.yaml" \
  -v $(pwd)/benchmarks/social_network/schemas:/app/schemas:ro \
  genezhang/clickgraph:latest

Or use docker-compose (uses published image by default):

docker-compose up -d

Option 2: Build from Source

# Prerequisites: Rust toolchain (1.85+) and Docker for ClickHouse

# 1. Clone and start ClickHouse
git clone https://github.com/genezhang/clickgraph
cd clickgraph
docker-compose up -d clickhouse-service

# 2. Build and run
cargo build --release
export CLICKHOUSE_URL="http://localhost:8123"
export CLICKHOUSE_USER="test_user"
export CLICKHOUSE_PASSWORD="test_pass"
export GRAPH_CONFIG_PATH="./benchmarks/social_network/schemas/social_benchmark.yaml"
cargo run --bin clickgraph

GRAPH_CONFIG_PATH is required. It tells ClickGraph how to map ClickHouse tables to graph nodes and edges.

Test Your Setup

# HTTP API
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"query": "MATCH (u:User) RETURN u.full_name LIMIT 5"}'

# Bolt protocol (cypher-shell, Neo4j Browser, or any Neo4j driver)
cypher-shell -a bolt://localhost:7687 -u neo4j -p password
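
The HTTP check can also be scripted. The following is a minimal sketch using only the Python standard library, assuming the server from the steps above is listening on localhost:8080. The `/query` path and the `query` field come from the curl example; the `parameters` field name is an assumption that should be checked against the API documentation, and `build_query_request` and `run_query` are hypothetical helper names.

```python
import json
import urllib.request


def build_query_request(cypher, parameters=None, base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for the /query endpoint."""
    body = {"query": cypher}
    if parameters:
        # Assumption: the "parameters" field name is illustrative, not confirmed.
        body["parameters"] = parameters
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(cypher, parameters=None, base_url="http://localhost:8080"):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_query_request(cypher, parameters, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_query("MATCH (u:User) RETURN u.full_name LIMIT 5")` would send the same request as the curl command. Splitting request construction from sending keeps the payload easy to inspect without a live server.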

Visualize with Neo4j Browser

Run the included demo for interactive graph visualization:

cd demos/neo4j-browser && bash setup.sh

Then open http://localhost:7474 and connect to bolt://localhost:7687. See demos/neo4j-browser/README.md for details.

AI Assistant Integration (MCP)

ClickGraph implements apoc.meta.schema() and Neo4j-compatible schema procedures, enabling AI assistants (Claude, etc.) to discover your graph structure via MCP servers like @anthropic-ai/mcp-server-neo4j and @neo4j/mcp-neo4j.

See the MCP Setup Guide for configuration details.

CLI Client

cargo build --release -p clickgraph-client
./target/release/clickgraph-client  # connects to http://localhost:8080

Schema Configuration

Map your tables to a graph with YAML:

views:
  - name: social_network
    nodes:
      - label: user
        table: users
        database: mydb
        node_id: user_id
        property_mappings:
          name: full_name
    edges:
      - type: follows
        table: user_follows
        database: mydb
        from_node: user
        to_node: user
        from_id: follower_id
        to_id: followed_id

Then query the mapped graph with Cypher:

MATCH (u:user)-[:follows]->(friend:user)
WHERE u.name = 'Alice'
RETURN friend.name
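
To make the mapping concrete, the sketch below mirrors the YAML view as plain Python data and shows how a graph property such as `u.name` could resolve to a table column. It is a toy illustration, not ClickGraph's translator; `SCHEMA`, `resolve_property`, and `edge_join_keys` are hypothetical names, and the fallback for unmapped properties is an assumption.

```python
# Plain-Python mirror of the YAML view above (illustrative only).
SCHEMA = {
    "nodes": {
        "user": {
            "database": "mydb",
            "table": "users",
            "node_id": "user_id",
            "property_mappings": {"name": "full_name"},
        }
    },
    "edges": {
        "follows": {
            "database": "mydb",
            "table": "user_follows",
            "from_node": "user",
            "to_node": "user",
            "from_id": "follower_id",
            "to_id": "followed_id",
        }
    },
}


def resolve_property(label, prop):
    """Return the qualified column backing a graph property.

    Assumption: a property without an explicit mapping falls back to a
    column of the same name.
    """
    node = SCHEMA["nodes"][label]
    column = node["property_mappings"].get(prop, prop)
    return f'{node["database"]}.{node["table"]}.{column}'


def edge_join_keys(edge_type):
    """Return ((edge column, node id column), ...) pairs for source and target."""
    edge = SCHEMA["edges"][edge_type]
    src = SCHEMA["nodes"][edge["from_node"]]
    dst = SCHEMA["nodes"][edge["to_node"]]
    return ((edge["from_id"], src["node_id"]), (edge["to_id"], dst["node_id"]))
```

Under this reading, `u.name` in the query refers to `mydb.users.full_name`, and the `follows` edge joins `user_follows.follower_id` and `user_follows.followed_id` back to `users.user_id`.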

Documentation


Development Status

Current Version: v0.6.5-dev

Test Coverage

  • Rust Unit Tests: 1,588 passing (100%)
  • Integration Tests: 3,068 passing (108 environment-dependent)
  • LDBC SNB: 36/37 queries passing (97%)
  • Benchmarks: 14/14 passing (100%)
  • E2E Tests: Bolt 4/4, Cache 5/5 (100%)

Known Limitations

  • Read-Only Engine: The query service does not support write operations by design; local writes are available only through the embedded mode write API
  • Anonymous Nodes: Use named nodes for better SQL generation

See STATUS.md and KNOWN_ISSUES.md for details.

Roadmap

| Phase | Version | Status |
|---|---|---|
| Phase 1 | v0.4.0 | Complete - Query cache, parameters, Bolt protocol |
| Phase 2 | v0.5.0 | Complete - Multi-tenancy, RBAC, auto-schema discovery |
| Phase 2.5-2.6 | v0.5.2-v0.5.3 | Complete - Schema variations, Cypher functions |
| Phase 3 | v0.6.3 | Complete - WITH redesign, GraphRAG, LDBC SNB, MCP |
| Phase 4 | v0.6.x | Next - user-requested features, advanced optimizations |

See ROADMAP.md for detailed feature tracking.

Contributing

Contributions welcome! See DEV_QUICK_START.md to get started and DEVELOPMENT_PROCESS.md for the full workflow.

License

ClickGraph is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

This project is developed from a fork of Brahmand, adding zero-ETL view-based graph querying, Neo4j ecosystem compatibility, and enterprise deployment capabilities.

About

ClickGraph is an open-source graph query service layer for ClickHouse, written in Rust, that turns an existing ClickHouse database into a graph view in minutes. It also provides an embedded mode with chdb, which now supports writes.
