Interactive dependency graph explorer for the Mathlib4 mathematical library.
An open-source reproduction of Crispher/MathlibExplorer, powered by leanprover-community/mathlib4 as the upstream data source. The project keeps the strongest ideas from the original tool: dependency-aware horizontal layout, topic grouping, zoom/pan navigation, node highlighting, and precomputed graph metrics.
The main full-data path uses a local leanprover-community/mathlib4 source
snapshot and statically parses Mathlib/**/*.lean import declarations. The
Hugging Face dataset phanerozoic/Lean4-Mathlib remains available as a theorem
and symbol-level supplement, and the sample fixture keeps tests runnable offline.
Run the complete offline demo:
make allThis uses data/sample/lean4_mathlib_sample.jsonl and produces:
data/processed/nodes.csvdata/processed/edges.csvdata/processed/graph.jsondata/processed/metrics.jsondocs/index.html
Run the full Mathlib module import network:
make mathlib-source
make data DATA_SOURCE=mathlib-source
make webWhen built from the local Mathlib source snapshot, selected modules include an
Open Mathlib source link that opens the corresponding .lean file in the
upstream Mathlib4 GitHub repository at the exact source commit recorded in the
generated graph.
To use the Hugging Face dataset when network access is available:
make data DATA_SOURCE=hf
make webFor a faster real-data smoke test, cap the Dataset Viewer fallback:
make data DATA_SOURCE=hf HF_ROW_LIMIT=1000
make webThe current local pyarrow build may fail to read Hugging Face's generated
Parquet shard with a repetition-level error. The downloader records the Parquet
metadata, then automatically falls back to the Hugging Face Dataset Viewer rows
API. HF_ROW_LIMIT=0 means "fetch all rows" through that fallback if needed;
large full-dataset pulls can take a while because the API pages at 100 rows.
If you already have a local Parquet/CSV/JSONL export:
make data DATA_SOURCE=file RAW_INPUT=/absolute/path/to/export.parquetdocs/index.html is the dependency-light interactive demo. The sidebar can be
collapsed with the round button in the graph corner when you want a wider
overview.
Import direction determines horizontal position: modules further right have deeper dependency chains. Topic/namespace lanes separate major mathematical areas vertically, while a weak force refinement keeps local neighborhoods from collapsing into identical rows.
Click a node to highlight its direct neighbors, transitive dependencies, and transitive dependents:
- Direct import edges are shown in blue; direct reverse-dependency edges in green.
- Transitive relations remain visible with a weaker color.
- Unrelated nodes and edges are faded.
The default edge view shows the transitive-reduction backbone. The raw import mode restores all source import edges, and selected-only mode focuses on the currently selected module's local flow.
detailed node information and dependency view.
Lie group node search and local neighborhood exploration.
scripts/: data extraction and graph pipelinescripts/mathlib_graph/: reusable graph pipeline packageweb/: TypeScript + Vite + Sigma.js/Graphology frontend scaffolddocs/: static deliverable and project notesdata/sample/: offline fixturedata/raw/: leanprover-community/mathlib4 source snapshotdata/processed/: generated graph datatests/: offline unit tests
make mathlib-source # Download a mathlib4 source snapshot
make data # Build nodes.csv, edges.csv, graph.json, metrics.json
make web # Build static interactive page in docs/index.html
make web-vite # Build the long-term Vite/Sigma frontend when npm deps are available
make test # Run offline tests
make all # Full artifact pipelineUse make mathlib-source MATHLIB_FORCE=1 to force-refresh the cached Mathlib
source snapshot.
The stable generated files are:
nodes.csv: one module per row, with centrality, community, depth, layout, and topic fields.edges.csv: one import relation per row,source imports target.graph.json: frontend-ready graph payload; local Mathlib source builds also includesourceFile,sourceUri, andsourceRefon nodes.metrics.json: counts, validation checks, and top analytic results.



