perf(graph-analytics): make C++ faster than Python by swapping slow maps for contiguous vectors by Phantom0299 · Pull Request #42 · iiitl/chuck

Phantom0299 · 2026-04-12T06:11:05Z

Title: perf: Rewrite C++ backend to finally beat the Python/NumPy baseline (#38)

It was a bit frustrating to see that the C++ backend was actually losing to the Python version in the benchmarks. After looking at the binding.cpp code, it was clear why the core pagerank loop was bogged down by std::map lookups and string comparisons for every single edge. Basically, the C++ code was doing way more work than it needed to, while NumPy's C-kernels were running highly optimized contiguous memory operations.

How I made it faster than Python-
I upgraded the backend to use a data oriented approach, focusing on how the cpu actually likes to read data:

Integer Mapping: I stopped using strings in the hot loop. I mapped all node names to integer IDs once at the start so the cpu can just work with numbers.
Straight-Line Memory: I swapped the slow std::map for a standard std::vector. Instead of making the CPU jump all over the place to find the next rank, this packs all the data tightly together in one continuous line. The CPU can just zoom straight through it without pausing.
Simpler Connections: I flattened the graph structure into basic lists of integer IDs. By getting rid of the bulky data structures, we cut out all the hidden background work that was dragging the C++ speed down.
The Results
I ran the benchmarks against both the original C++ code and the Python baseline. The C++ version is now over 3x faster than it was and has successfully overtaken the Python/NumPy implementation.

Original C++: 0.027044s (Slower than Python)
Python (NumPy): 0.008787s (Baseline)
Optimized C++ (My Fix): 0.008621s (Faster than Python)

Total Speedup: ~3.14x compared to the previous C++ version.

Fixes #38

Summary by CodeRabbit

Refactor
- Internal graph processing and PageRank implementation rewritten for significantly lower memory use and faster computation on large graphs; numerical results, outputs, and the public API remain unchanged.
- Performance improvements should be transparent to users—same behavior and results, with reduced resource usage and quicker processing for large inputs.

…aps for contiguous vectors

coderabbitai · 2026-04-12T06:11:22Z

Walkthrough

Replaces string-keyed PageRank state with index-based structures: name_to_idx (std::unordered_map), adjacency lists adj (std::vector<std::vector<int>>), and rank vectors; converts neighbor names to indices, updates iterative rank computation (including explicit dangling mass redistribution), and updates top-node selection and checksum; public signatures unchanged.

Changes

Cohort / File(s)	Summary
PageRank C++ implementation `chuck/tasks/graph_analytics/native_cpp/binding.cpp`	Replaced `map<string,...>` state with index-based containers (`name_to_idx`, `adj`, `rank`, `new_rank`); converted neighbor names to integer indices; dangling nodes are represented by empty `adj[u]` and their mass is computed as `dangling_share` and redistributed each iteration; iterative update now zeroes `new_rank` via `std::fill(base)`, accumulates contributions with `new_rank[v] += (dampingrank[u])/adj[u].size()`, and adds redistributed dangling/base mass; top-node selection and checksum now operate on indices (`top_idx`, checksum sums `(i+1)rank[i]`); exported function signatures (`py::dict solve(py::object payload_obj)`, module init) unchanged.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: replacing slow string-keyed maps with index-based contiguous vectors for performance optimization in the C++ graph analytics backend.
Linked Issues check	✅ Passed	The PR successfully addresses issue `#38`'s core objective by replacing inefficient map<string, vector> with index-based structures and contiguous vectors, achieving reported speedup from 0.027s to 0.008621s, exceeding the Python baseline of 0.008787s.
Out of Scope Changes check	✅ Passed	All changes are scoped to the C++ PageRank implementation in binding.cpp, directly addressing the performance optimization objective without unrelated modifications.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

chuck/tasks/graph_analytics/native_cpp/binding.cpp (2)
73-73: Consider std::swap instead of copy assignment for better performance.

rank = new_rank performs a full vector copy (O(n)). Since this is a performance-focused PR, consider using std::swap(rank, new_rank) for O(1) exchange. The values in new_rank are overwritten by std::fill at the start of each iteration anyway.
Suggested optimization
-        rank = new_rank; 
+        std::swap(rank, new_rank);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp` at line 73, The
assignment "rank = new_rank" causes an O(n) vector copy; replace it with an O(1
exchange using std::swap(rank, new_rank) in the same scope (where rank and
new_rank are used, and new_rank is later overwritten by std::fill). Ensure
<utility> (or <algorithm>) is included if not already so std::swap is available;
keep the rest of the iteration logic unchanged.
33-33: Narrowing conversion from size_t to int.

nodes.size() returns std::size_t. Assigning to int is a narrowing conversion that could truncate on graphs with more than ~2 billion nodes. While unlikely in practice, consider using std::size_t or int64_t for correctness, or add an explicit cast to silence compiler warnings.
Suggested fix
-    int n = nodes.size();
+    int n = static_cast<int>(nodes.size());
Or for large graph safety:
-    int n = nodes.size();
+    std::size_t n = nodes.size();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp` at line 33, The
assignment int n = nodes.size() narrows std::size_t into int; change the type of
n to std::size_t (or int64_t if you prefer signed) or add an explicit cast if
you intentionally want truncation so it no longer performs an implicit narrowing
conversion; update the declaration of n in binding.cpp (the variable initialized
from nodes.size()) accordingly and ensure any downstream uses handle the new
type.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp`:
- Around line 14-16: Remove the trailing whitespace characters present in the
file around the graph deserialization line where graph is assigned
(std::map<std::string, std::vector<std::string>> graph =
payload_obj.cast<std::map<std::string, std::vector<std::string>>>()), and also
strip trailing spaces from the nearby lines referenced in the review (other
surrounding lines in the same function/block). Ensure no lines end with spaces
so the compiler/CI whitespace checks pass.
- Around line 50-53: The loop that builds adjacency uses name_to_idx[target]
which silently inserts a default entry for unknown targets; change it to use
name_to_idx.at(target) so a std::out_of_range is thrown for malformed input (in
the block referencing adj, found, target, and name_to_idx) and thus fail fast on
edges to non-existent nodes—update the push_back call to use .at(target) and
handle or let the exception propagate, or add validation that all targets exist
before populating adj.

---

Nitpick comments:
In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp`:
- Line 73: The assignment "rank = new_rank" causes an O(n) vector copy; replace
it with an O(1 exchange using std::swap(rank, new_rank) in the same scope (where
rank and new_rank are used, and new_rank is later overwritten by std::fill).
Ensure <utility> (or <algorithm>) is included if not already so std::swap is
available; keep the rest of the iteration logic unchanged.
- Line 33: The assignment int n = nodes.size() narrows std::size_t into int;
change the type of n to std::size_t (or int64_t if you prefer signed) or add an
explicit cast if you intentionally want truncation so it no longer performs an
implicit narrowing conversion; update the declaration of n in binding.cpp (the
variable initialized from nodes.size()) accordingly and ensure any downstream
uses handle the new type.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 58445ec1-1a26-4ef8-a99a-33bb65e865c2

📥 Commits

Reviewing files that changed from the base of the PR and between a302e5f and 41a01f5.

📒 Files selected for processing (1)

chuck/tasks/graph_analytics/native_cpp/binding.cpp

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

chuck/tasks/graph_analytics/native_cpp/binding.cpp (1)

47-48: ⚠️ Potential issue | 🟠 Major

Fail fast on unknown edge targets.

Line 48 uses operator[], so a missing target is inserted with index 0 and the solver returns a corrupted result instead of throwing. That also bypasses the fallback path in chuck/native_bindings.py, which only retries Python when the native solver raises. Use .at(target) or validate the payload before building adj.

Suggested fix

-            for (const auto& target : found->second) {
-                adj[u].push_back(name_to_idx[target]);
-            }
+            for (const auto& target : found->second) {
+                adj[u].push_back(name_to_idx.at(target));
+            }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp` around lines 47 - 48, The
loop that builds adj uses name_to_idx[target] which inserts a default 0 when
target is missing; instead validate and fail fast so the native solver throws
and triggers the Python fallback: in the loop over found->second (symbols:
found, target, adj, name_to_idx) either replace the operator[] access with
name_to_idx.at(target) or explicitly check name_to_idx.find(target) and if not
found throw a std::runtime_error (with a clear message including target) before
calling adj[u].push_back(...).

🧹 Nitpick comments (1)

chuck/tasks/graph_analytics/native_cpp/binding.cpp (1)
57-69: Swap the rank buffers instead of copying them.

rank = new_rank; copies n doubles every iteration. swap keeps the behavior the same here because new_rank is fully reset with std::fill at the top of the next pass.
Suggested fix
-        rank = new_rank;
+        rank.swap(new_rank);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp` around lines 57 - 69, The
loop currently does an O(n) copy each iteration via "rank = new_rank;"; replace
that assignment with a constant-time buffer swap (e.g., call rank.swap(new_rank)
or std::swap(rank, new_rank)) at the end of the iteration so you reuse the
preallocated vectors and keep the same behavior given you reset new_rank with
std::fill at the top of the next pass; update the assignment in the loop body
where variables rank and new_rank are used.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp`:
- Around line 41-49: The code currently materializes a full adjacency list by
calling adj[u].resize(n) and filling adj[u][v]=v when graph.find(nodes[u]) is
missing or empty; this wastes O(n²) memory for many dangling nodes. Instead,
when found == graph.end() || found->second.empty() leave adj[u] empty (do not
resize or push all v), and update the PageRank iteration logic to account for
dangling nodes by computing a single dangling_mass (sum of ranks of nodes with
empty adj) each iteration and distributing dangling_mass/n to all nodes during
rank update; apply this change at both places handling dangling branches (the
block with adj[u].resize(n) / adj[u][v]=v and the analogous block around lines
63-66) and keep use of name_to_idx and other mappings unchanged.

---

Duplicate comments:
In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp`:
- Around line 47-48: The loop that builds adj uses name_to_idx[target] which
inserts a default 0 when target is missing; instead validate and fail fast so
the native solver throws and triggers the Python fallback: in the loop over
found->second (symbols: found, target, adj, name_to_idx) either replace the
operator[] access with name_to_idx.at(target) or explicitly check
name_to_idx.find(target) and if not found throw a std::runtime_error (with a
clear message including target) before calling adj[u].push_back(...).

---

Nitpick comments:
In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp`:
- Around line 57-69: The loop currently does an O(n) copy each iteration via
"rank = new_rank;"; replace that assignment with a constant-time buffer swap
(e.g., call rank.swap(new_rank) or std::swap(rank, new_rank)) at the end of the
iteration so you reuse the preallocated vectors and keep the same behavior given
you reset new_rank with std::fill at the top of the next pass; update the
assignment in the loop body where variables rank and new_rank are used.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f2382e96-56dc-48d9-90dd-6bb46d6e05c3

📥 Commits

Reviewing files that changed from the base of the PR and between 41a01f5 and 46d849e.

📒 Files selected for processing (1)

chuck/tasks/graph_analytics/native_cpp/binding.cpp

Aaryan-Dadu · 2026-04-12T08:42:05Z

 #include <map>
 #include <string>
 #include <vector>
+#include <unordered_map>


You are using unordered_map, how will you avoid hash collisions?

Aaryan-Dadu · 2026-04-12T08:43:17Z

+    std::unordered_map<std::string, int> name_to_idx;
+    for (int i = 0; i < n; ++i) {
+        name_to_idx[nodes[i]] = i;
+    }


Please share the A/B comparison workflow for the changes

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp`:
- Around line 84-86: The tie-break in the top-node selection loop uses an
epsilon check allowing near-equal scores to fall back to node ordering; update
the condition in the loop that iterates over rank[] (the for loop using rank,
top_score, nodes, top_idx) to only use exact equality for the tie case (i.e.,
replace the fabs(rank[i] - top_score) < 1e-15 check with a direct equality
comparison rank[i] == top_score) so that node-name ordering is applied only when
scores are exactly equal, matching the Python implementation.
- Around line 34-36: The comment about unordered_map::reserve on the name_to_idx
local is misleading; change it to state that reserve(n) pre-allocates buckets to
reduce rehashing and keep the average load factor lower (improving performance)
but does not prevent hash collisions, which are determined by the hash function;
update the comment near name_to_idx and the reserve call to reflect this
accurate behavior.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ff7e5089-da79-4d38-bcdc-6e770019837a

📥 Commits

Reviewing files that changed from the base of the PR and between 46d849e and 48ba0e7.

📒 Files selected for processing (1)

chuck/tasks/graph_analytics/native_cpp/binding.cpp

coderabbitai · 2026-04-12T10:17:22Z

+    // Fix: .reserve(n) ensures the load factor stays low, preventing hash collisions
+    std::unordered_map<std::string, int> name_to_idx;
+    name_to_idx.reserve(n);


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

In C++, does std::unordered_map::reserve prevent hash collisions, or does it only reduce rehashes and average bucket occupancy?

💡 Result:

std::unordered_map::reserve does not prevent hash collisions. It only reduces rehashes and average bucket occupancy (i.e., lowers the load factor). Hash collisions occur when distinct keys have the same hash value, determined solely by the hash function. Collisions are handled by chaining (linked lists per bucket) regardless of the number of buckets. reserve cannot change hash values or prevent collisions; it only allocates more buckets. reserve(n) sets bucket_count to at least ceil(n / max_load_factor), ensuring at least n elements can be inserted without exceeding max_load_factor and triggering automatic rehash. This pre-allocates buckets to avoid incremental rehashes during insertion and keeps average elements per bucket low, improving performance by reducing chain lengths for collided elements. No source indicates reserve affects collision occurrence; it manages bucket count for load factor control.

Citations:

1: https://en.cppreference.com/w/cpp/container/unordered_map/reserve.html

2: https://stackoverflow.com/questions/42373232/why-does-stdunordered-map-have-a-reserve-method

3: https://stackoverflow.com/questions/31098123/c-unordered-map-collision-handling-resize-and-rehash

4: https://en.cppreference.com/w/cpp/container/unordered_map.html

Correct the misleading comment on unordered_map::reserve.

.reserve(n) reduces rehashing and keeps average bucket occupancy low, but does not prevent hash collisions. Collisions are determined by the hash function and cannot be prevented by adjusting bucket count. Update the comment to reflect what the method actually guarantees.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp` around lines 34 - 36, The comment about unordered_map::reserve on the name_to_idx local is misleading; change it to state that reserve(n) pre-allocates buckets to reduce rehashing and keep the average load factor lower (improving performance) but does not prevent hash collisions, which are determined by the hash function; update the comment near name_to_idx and the reserve call to reflect this accurate behavior.

coderabbitai · 2026-04-12T10:17:22Z

+    for (int i = 1; i < n; ++i) {
+        if (rank[i] > top_score || (std::abs(rank[i] - top_score) < 1e-15 && nodes[i] > nodes[top_idx])) {
+            top_score = rank[i];


⚠️ Potential issue | 🟡 Minor

Remove the epsilon from the top-node tie-break.

The Python implementation only falls back to node-name ordering when the scores are exactly equal. With std::abs(rank[i] - top_score) < 1e-15, a slightly smaller score can still win here, so top_node can diverge from chuck/tasks/graph_analytics/task.py.

💡 Suggested fix

- if (rank[i] > top_score || (std::abs(rank[i] - top_score) < 1e-15 && nodes[i] > nodes[top_idx])) { + if (rank[i] > top_score || (rank[i] == top_score && nodes[i] > nodes[top_idx])) {

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for (int i = 1; i < n; ++i) {

if (rank[i] > top_score || (std::abs(rank[i] - top_score) < 1e-15 && nodes[i] > nodes[top_idx])) {

top_score = rank[i];

for (int i = 1; i < n; ++i) {

if (rank[i] > top_score || (rank[i] == top_score && nodes[i] > nodes[top_idx])) {

top_score = rank[i];

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@chuck/tasks/graph_analytics/native_cpp/binding.cpp` around lines 84 - 86, The tie-break in the top-node selection loop uses an epsilon check allowing near-equal scores to fall back to node ordering; update the condition in the loop that iterates over rank[] (the for loop using rank, top_score, nodes, top_idx) to only use exact equality for the tie case (i.e., replace the fabs(rank[i] - top_score) < 1e-15 check with a direct equality comparison rank[i] == top_score) so that node-name ordering is applied only when scores are exactly equal, matching the Python implementation.

perf(graph-analytics): make C++ faster than Python by swapping slow m…

41a01f5

…aps for contiguous vectors

mini-walkerx Bot mentioned this pull request Apr 12, 2026

graph-analytics C++ backend is slower than Python backend #38

Open

style: remove whitespace to fix CI

46d849e

Phantom0299 force-pushed the perf/cpp-graph-analytics branch from 59ba1b8 to 46d849e Compare April 12, 2026 06:16

coderabbitai Bot reviewed Apr 12, 2026

View reviewed changes

Comment thread chuck/tasks/graph_analytics/native_cpp/binding.cpp Outdated

Comment thread chuck/tasks/graph_analytics/native_cpp/binding.cpp

coderabbitai Bot reviewed Apr 12, 2026

View reviewed changes

Comment thread chuck/tasks/graph_analytics/native_cpp/binding.cpp

Phantom0299 mentioned this pull request Apr 12, 2026

perf: optimize pagerank using sparse vectorization and np.add.at #40

Draft

Aaryan-Dadu requested changes Apr 12, 2026

View reviewed changes

perf: use reserved unordered_map for O(1) lookups and fix collisions

48ba0e7

coderabbitai Bot reviewed Apr 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(graph-analytics): make C++ faster than Python by swapping slow maps for contiguous vectors#42

perf(graph-analytics): make C++ faster than Python by swapping slow maps for contiguous vectors#42
Phantom0299 wants to merge 3 commits into
iiitl:mainfrom
Phantom0299:perf/cpp-graph-analytics

Phantom0299 commented Apr 12, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 12, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Aaryan-Dadu Apr 12, 2026 •

edited

Loading

Uh oh!

Aaryan-Dadu Apr 12, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 12, 2026

Uh oh!

coderabbitai Bot Apr 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Phantom0299 commented Apr 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Aaryan-Dadu Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Aaryan-Dadu Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Phantom0299 commented Apr 12, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 12, 2026 •

edited

Loading

Aaryan-Dadu Apr 12, 2026 •

edited

Loading