sandx-graph

Graph intelligence engine — knowledge graph construction, neighborhood consensus, semantic linkage.

Part of the SandX Lab computational infrastructure ecosystem.

What It Does

sandx-graph is the graph reasoning layer that operates downstream of sandx-er. It constructs knowledge graphs from resolved entity clusters and computes neighborhood consensus — a measure of how strongly each node's local neighborhood agrees.

sandx-er clusters  →  GraphBuilder  →  KnowledgeGraph  →  ConsensusEngine  →  consensus scores

Status

v0.1 — Working

| Component | Status |
| --- | --- |
| `GraphBuilder`: construct graphs from clusters, DataFrames, similarity matrices | Working |
| `KnowledgeGraph`: undirected weighted graph with adjacency traversal | Working |
| `ConsensusEngine`: BFS neighborhood consensus computation | Working |
| NetworkX export | Working (optional dep) |
| PyPI package | Working |

Installation

pip install sandx-graph

Or from source:

git clone https://github.com/sandxlab/sandx-graph
cd sandx-graph
pip install -e ".[dev]"

For NetworkX export:

pip install "sandx-graph[networkx]"

Demo

pip install sandx-graph
python -m examples.graph_consensus

Constructs a 5-node knowledge graph of tech companies, scores neighborhood consensus, and prints the weighted edge list — no external data required.

Quick Start

From sandx-er resolution output

import pandas as pd
from sandx_er import EntityResolver
from sandx_graph import GraphBuilder, ConsensusEngine

# Resolve records into entity clusters
records = pd.DataFrame({
    "name": ["Acme Corp", "Acme Corp.", "GlobalTech Inc", "Global Tech"],
    "city": ["Boston", "Boston", "New York", "New York"],
})
er = EntityResolver(blocking="lsh", similarity="jaccard", threshold=0.4)
result = er.resolve(records)

# Build knowledge graph from resolved clusters
builder = GraphBuilder()
graph = builder.from_clusters(result.clusters)
print(graph)  # KnowledgeGraph(n_nodes=2, n_edges=0)

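# Build a graph with relationship edges from a pairwise similarity matrix.
# This rebinds `graph` to the graph built from the matrix.
import numpy as np
ids = [c.canonical_id for c in result.clusters]
# 2x2 matrix: one row/column per resolved cluster (two clusters expected here)
sim = np.array([[1.0, 0.3], [0.3, 1.0]])
# The off-diagonal similarity (0.3) is below threshold=0.5, so presumably no edge is created
graph = builder.from_similarity_matrix(ids, sim, threshold=0.5)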

From DataFrames

import pandas as pd
from sandx_graph import GraphBuilder, ConsensusEngine

nodes_df = pd.DataFrame({"node_id": ["e1", "e2", "e3"], "label": ["Acme", "GlobalTech", "Initech"]})
edges_df = pd.DataFrame({"source": ["e1", "e2"], "target": ["e2", "e3"], "weight": [0.85, 0.62]})

builder = GraphBuilder()
graph = builder.from_dataframe(nodes_df, edges_df)

# Compute neighborhood consensus
engine = ConsensusEngine(graph)
score = engine.compute("e1", depth=2)
print(score)
# ConsensusScore(node='e1', score=0.735, support=2, conflict=0)

# Batch over all nodes
all_scores = engine.compute_all(depth=1)
stats = engine.summary(depth=1)
print(stats)
# {'mean': 0.735, 'median': 0.735, 'std': 0.115, 'min': 0.620, 'max': 0.850}

Consensus Score

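ConsensusEngine runs a breadth-first search (BFS) from a node up to a given depth, collecting the weights of all edges it encounters. The consensus score is the weighted mean of those edge weights. In the DataFrames example above, compute("e1", depth=2) reaches both edges, and the reported score equals their plain average: (0.85 + 0.62) / 2 = 0.735.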

| Score | Interpretation |
| --- | --- |
| → 1.0 | Node connected to high-confidence, strongly agreeing neighbors |
| → 0.5 | Mixed neighborhood: some support, some conflict |
| → 0.0 | Weak or conflicting edges throughout the neighborhood |

Isolated nodes (degree 0) return score 1.0 by convention.
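
The description above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name `neighborhood_consensus` and the adjacency-dict input are hypothetical, and it assumes every edge counts once with equal weight in the average, which matches the 0.735 result shown in the Quick Start.

```python
from collections import deque


def neighborhood_consensus(adjacency, start, depth=2):
    """Mean weight of all edges found within `depth` hops of `start`.

    `adjacency` maps node -> {neighbor: weight} for an undirected graph.
    Hypothetical helper for illustration only.
    """
    visited = {start}
    seen_edges = set()
    weights = []
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand nodes sitting at the depth limit
        for neighbor, weight in adjacency.get(node, {}).items():
            edge = frozenset((node, neighbor))
            if edge not in seen_edges:  # count each undirected edge once
                seen_edges.add(edge)
                weights.append(weight)
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    if not weights:
        return 1.0  # isolated node: 1.0 by convention
    return sum(weights) / len(weights)


# Same three-node graph as the DataFrames example
adjacency = {
    "e1": {"e2": 0.85},
    "e2": {"e1": 0.85, "e3": 0.62},
    "e3": {"e2": 0.62},
}
score_e1 = neighborhood_consensus(adjacency, "e1", depth=2)
print(round(score_e1, 3))  # 0.735
```

With depth=1 the same sketch yields 0.85 for e1, 0.735 for e2, and 0.62 for e3, since each node then only sees its own incident edges.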

API Reference

`GraphBuilder`

| Method | Description |
| --- | --- |
| `from_clusters(clusters)` | One node per `sandx-er` EntityCluster; no edges |
| `from_dataframe(nodes_df, edges_df, ...)` | Build from node/edge DataFrames |
| `from_similarity_matrix(ids, similarity, threshold)` | Build from pairwise similarity matrix |
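
To make the role of `threshold` concrete, here is a rough sketch of how a pairwise similarity matrix could be turned into an edge list. The helper `edges_from_similarity` is hypothetical. Two of its choices are assumptions rather than documented behavior of `from_similarity_matrix`: it keeps pairs at or above the threshold (a library might use a strict comparison), and it reads only the upper triangle.

```python
def edges_from_similarity(ids, similarity, threshold=0.5):
    """Return (source, target, weight) triples for pairs at or above `threshold`.

    Hypothetical helper; `similarity` is a square, symmetric matrix given as
    nested lists (anything indexable as similarity[i][j] works).
    """
    n = len(ids)
    if len(similarity) != n or any(len(row) != n for row in similarity):
        raise ValueError("similarity must be a square matrix matching ids")
    edges = []
    for i in range(n):
        # Upper triangle only: skips the diagonal and mirrored duplicates
        for j in range(i + 1, n):
            weight = similarity[i][j]
            if weight >= threshold:
                edges.append((ids[i], ids[j], weight))
    return edges


ids = ["e1", "e2", "e3"]
similarity = [
    [1.0, 0.85, 0.20],
    [0.85, 1.0, 0.62],
    [0.20, 0.62, 1.0],
]
edges = edges_from_similarity(ids, similarity, threshold=0.5)
print(edges)  # [('e1', 'e2', 0.85), ('e2', 'e3', 0.62)]
```

The e1/e3 pair (0.20) falls below the cutoff, so it produces no edge.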

`KnowledgeGraph`

| Attribute / Method | Description |
| --- | --- |
| `n_nodes`, `n_edges` | Graph size |
| `nodes` | Dict of node_id → attribute dict |
| `edges` | List of (source, target, weight) triples |
| `neighbors(node_id)` | Adjacent node IDs |
| `neighbors_weighted(node_id)` | (neighbor_id, weight) pairs |
| `degree(node_id)` | Number of incident edges |
| `has_node(node_id)`, `has_edge(a, b)` | Membership checks |
| `to_dataframe()` | Edge list as pandas DataFrame |
| `to_networkx()` | Export to NetworkX Graph |
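
For readers unfamiliar with the data structure, the sketch below shows one way an undirected weighted graph with these accessors could be laid out. `TinyGraph`, its `add_node`/`add_edge` methods, and its internal adjacency dict are all hypothetical, chosen for illustration; they say nothing about how `KnowledgeGraph` is actually implemented.

```python
class TinyGraph:
    """Illustrative undirected weighted graph; not the sandx-graph implementation."""

    def __init__(self):
        self.nodes = {}  # node_id -> attribute dict
        self._adj = {}   # node_id -> {neighbor_id: weight}

    def add_node(self, node_id, **attrs):
        self.nodes.setdefault(node_id, {}).update(attrs)
        self._adj.setdefault(node_id, {})

    def add_edge(self, a, b, weight=1.0):
        if a == b:
            raise ValueError("self-loops are not supported in this sketch")
        self.add_node(a)
        self.add_node(b)
        # Store the edge in both directions so traversal is symmetric
        self._adj[a][b] = weight
        self._adj[b][a] = weight

    @property
    def n_nodes(self):
        return len(self.nodes)

    @property
    def n_edges(self):
        # Each undirected edge appears twice in the adjacency dict
        return sum(len(nbrs) for nbrs in self._adj.values()) // 2

    @property
    def edges(self):
        seen, triples = set(), []
        for a, nbrs in self._adj.items():
            for b, weight in nbrs.items():
                if frozenset((a, b)) not in seen:
                    seen.add(frozenset((a, b)))
                    triples.append((a, b, weight))
        return triples

    def neighbors(self, node_id):
        return list(self._adj[node_id])

    def neighbors_weighted(self, node_id):
        return list(self._adj[node_id].items())

    def degree(self, node_id):
        return len(self._adj[node_id])

    def has_node(self, node_id):
        return node_id in self.nodes

    def has_edge(self, a, b):
        return b in self._adj.get(a, {})


g = TinyGraph()
g.add_node("e1", label="Acme")
g.add_edge("e1", "e2", 0.85)
g.add_edge("e2", "e3", 0.62)
print(g.n_nodes, g.n_edges)          # 3 2
print(g.neighbors_weighted("e2"))    # [('e1', 0.85), ('e3', 0.62)]
```

Storing each edge under both endpoints trades a little memory for constant-time neighbor lookup, which is what BFS-style traversal needs.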

`ConsensusEngine`

| Method | Description |
| --- | --- |
| `compute(node_id, depth=2)` | Consensus score for one node |
| `compute_all(depth=2)` | Scores for all nodes |
| `summary(depth=1)` | Mean/median/std/min/max over all nodes |
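
As a sanity check on the summary statistics, the snippet below recomputes them with the standard library for the three-node Quick Start graph. The per-node depth-1 scores are an assumption (each node's score taken as the plain mean of its incident edge weights), and `summarize` is a hypothetical helper, not part of the package. Under that assumption the numbers agree with the Quick Start output if `std` is the sample standard deviation; the population form would give roughly 0.094 instead.

```python
import statistics


def summarize(scores):
    """Mean/median/std/min/max over an iterable of scores (hypothetical helper)."""
    values = list(scores)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation (n - 1 denominator); undefined for one value
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# Assumed depth-1 scores: e1 sees 0.85, e2 sees both edges, e3 sees 0.62
depth1_scores = {"e1": 0.85, "e2": 0.735, "e3": 0.62}
stats = summarize(depth1_scores.values())
print({key: round(value, 3) for key, value in stats.items()})
# {'mean': 0.735, 'median': 0.735, 'std': 0.115, 'min': 0.62, 'max': 0.85}
```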

License

Apache 2.0 — see LICENSE
