Add KB Arena (knowledge graph + hybrid retrieval benchmark) #28
Open
xmpuspus wants to merge 1 commit into
Adds KB Arena to the Open-source Project section.
KB Arena is an open-source benchmark that runs nine architecturally distinct retrieval strategies head-to-head on user-supplied corpora. The two GraphRAG-relevant strategies:
- knowledge_graph: extraction-driven Neo4j graph using a universal 5-node-type / 7-rel-type schema (Topic, Component, Process, Config, Constraint + DEPENDS_ON, CONTAINS, CONNECTS_TO, TRIGGERS, CONFIGURES, ALTERNATIVE_TO, EXTENDS). Source provenance is stamped on every entity end-to-end so chunk-level retrieval matches against section ground truth.
- hybrid: RRF-fused vector + graph retrieval with three-stage intent routing (keyword scan → Haiku LLM → regex fallback). Domain-agnostic.

Why it matters for the GraphRAG community: the field still lacks a clean apples-to-apples way to say "graph beats vector on this corpus by X with p<Y." KB Arena ships paired-bootstrap 95% CIs and Wilcoxon paired two-sided p-values on per-question IR metrics (Recall@k, NDCG, MAP, R-Precision, bpref), so GraphRAG vs vector vs hybrid comparisons report effect size with statistical confidence rather than mean-only deltas.
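For reviewers unfamiliar with the paired-bootstrap idea mentioned above, here is a minimal sketch using only the Python standard library. It is not taken from KB Arena's code: the function name `paired_bootstrap_ci`, its parameters, and the sample Recall@k values are all hypothetical, and it assumes a simple percentile bootstrap over per-question score differences.

```python
import random
from statistics import mean


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-question difference (a - b).

    Hypothetical helper for illustration; not KB Arena's API.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")
    # Pairing: each difference comes from the same question under both strategies.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)
    n = len(diffs)
    # Resample questions with replacement and record the mean difference each time.
    means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), lo, hi


# Made-up per-question Recall@k for two strategies on the same eight questions.
graph_recall = [0.8, 0.6, 1.0, 0.4, 0.8, 0.6, 1.0, 0.8]
vector_recall = [0.6, 0.6, 0.8, 0.4, 0.6, 0.4, 0.8, 0.8]

delta, ci_low, ci_high = paired_bootstrap_ci(graph_recall, vector_recall)
print(f"mean delta={delta:.3f}, 95% CI=[{ci_low:.3f}, {ci_high:.3f}]")
```

Resampling the differences rather than the two score lists independently keeps the per-question pairing intact, which is what lets the interval reflect the effect size between strategies rather than the spread across questions.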
The kb-arena entry is appended to the existing Open-source Project list, following the badge-prefix format used by the other 14 entries. No other sections are modified.