test: LDBC SNB benchmark suite for embedded mode#249
Conversation
Self-contained benchmark loading sf0.003 Parquet data into chdb and running all LDBC queries (short, complex, BI). Files: - ldbc_snb_embedded.yaml: Schema with table_function:file() sources for all 14 node types and 31 edge types, including FK-edge subquery views and polymorphic Message unions - embedded_benchmark.py: Full runner for 41 queries with timing, --sql-only mode, --filter, --verbose options - embedded_benchmark_test.py: Pytest suite for short-1..7 with sql_only parametrized tests (no chdb needed) and chdb e2e tests (gated by CLICKGRAPH_CHDB_TESTS=1) No existing files modified. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
genezhang
left a comment
There was a problem hiding this comment.
Code Review: LDBC SNB Benchmark Suite for Embedded Mode
Impressive and thorough. 676-line schema mapping the full LDBC SNB graph (14 node types, 31 edge types) to Parquet files via table_function:file() and table_function:(SELECT ... UNION ALL ...) for polymorphic Message views. The benchmark runner and pytest suite are well-structured.
What's good
- Schema is comprehensive: Covers all LDBC entity types including polymorphic Message (Post + Comment UNION), FK-edge patterns (Person→City via LocationCityId), filtered label variants (City/Country/Continent from Place), and multi-from/to edge types (IS_LOCATED_IN for Person/Post/Comment/Organisation)
__DATA_DIR__placeholder pattern: Clean separation between schema structure and data location — same approach as the tutorial examples--sql-onlymode: Enables CI testing without chdb — the pytestTestSqlOnlyclass validates all 7 short queries translate to SQL- Graceful expected failures:
EXPECTED_FAILURESdict with reasons (bi-16 CALL subquery, etc.) — queries are skipped, not falsely reported as passing xfailin pytest: Message-related queries (short-2, 4, 5, 6, 7) correctly usepytest.xfailsince they depend on Message view support which may have issues- Benchmark runner is standalone: argparse with
--filter,--verbose,--timeout— works as both a development tool and CI check - Timing integration: Uses
result.compiling_timeandresult.execution_timefrom the new Kuzu parity API
Observations
1. Duplicated parameter substitution logic
substitute_params() in embedded_benchmark.py and _load_and_prepare() in embedded_benchmark_test.py have identical parameter substitution code (comment stripping + $param replacement). Consider extracting to a shared helper module or having the test import from the benchmark runner.
2. pytestmark overwrite
pytestmark = pytest.mark.skipif(...)
pytestmark = [pytestmark, pytest.mark.skipif(...)]The second line references the first pytestmark correctly (it becomes a list), but this pattern is fragile — it depends on the first assignment being evaluated before the second. A clearer approach:
pytestmark = [
pytest.mark.skipif(not CLICKGRAPH_AVAILABLE, reason="..."),
pytest.mark.skipif(os.environ.get(...) not in (...), reason="..."),
]3. TestSqlOnly class also gated behind CLICKGRAPH_CHDB_TESTS
The module-level pytestmark skips all tests when CLICKGRAPH_CHDB_TESTS is not set, including TestSqlOnly which explicitly doesn't need chdb. This means the sql_only tests won't run in CI unless the env var is set. Consider moving the chdb gate to TestInteractiveShort only, or overriding with pytestmark = [] on TestSqlOnly.
4. --timeout flag declared but not used
args.timeout is parsed but never applied — there's no signal.alarm() or Connection.set_query_timeout() call. The signal import at line 24 suggests this was planned. Either wire it up or remove the flag to avoid confusion.
5. Schema: duplicate table names for different from_node/to_node
Several IS_LOCATED_IN edges share the same table: Organisation_isLocatedIn_Place with different from_node/to_node pairs (Organisation→Place, Company→Country, University→City). This works because ClickGraph resolves by the composite key TYPE::From::To, but worth a comment in the schema noting this is intentional.
Verdict
LGTM with minor nits. The schema mapping is the star — it makes the full LDBC SNB benchmark runnable on Parquet files without any ClickHouse server. #3 (sql_only gating) is the most actionable issue since it affects CI usability.
1. Deduplicated param substitution: test imports substitute_params and load_query from embedded_benchmark.py instead of reimplementing 2. Cleaned up pytestmark: single skipif for clickgraph availability; chdb gate moved to TestInteractiveShort class only 3. TestSqlOnly now runs without CLICKGRAPH_CHDB_TESTS — enables CI testing of Cypher→SQL translation 4. Removed unused --timeout flag and signal import 5. Added comment about intentional duplicate table names for IS_LOCATED_IN edges (Organisation/Company/University variants) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Self-contained LDBC SNB benchmark for embedded mode using sf0.003 Parquet data (1.4 MB, already in-repo).
Files
ldbc_snb_embedded.yamltable_function:file()Parquet sources for 14 nodes + 31 edgesembedded_benchmark.py--sql-only/--filter/--verboseoptionsembedded_benchmark_test.pyUsage
No existing files modified.
🤖 Generated with Claude Code