feat(recall): add min_score threshold and adaptive floor filtering (#73) #101
jack-arturo merged 6 commits into main
Conversation
- Add scripts/bench/health_check.py for recall health diagnostics (score distribution, entity quality, cross-query overlap, latency)
- Fix datetime tz-awareness bug in LoCoMo date matching
- Fix restore_and_eval.sh to produce .json result files
- Add bench-health Makefile target
- Record Voyage 4 baselines: locomo-mini 76.97%, locomo-full 80.06%
- Create benchmarks/EXPERIMENT_LOG.md for tracking experiments

Made-with: Cursor
- Add RECALL_MIN_SCORE config (default 0.0) and min_score query param
- Add RECALL_ADAPTIVE_FLOOR config (default true) for automatic steep-dropoff detection that prunes low-quality result tails
- Apply filtering at both per-query and final assembly stages
- Include score_filter metadata in response when active
- No benchmark regression: locomo-mini 76.97% (unchanged)

The adaptive floor detects gaps >15% of max score in the top half of results and cuts below the dropoff (a sketch of the heuristic follows below). Combined with the min_score param, clients can now control noise filtering. Full impact is expected after the decay fix (#78) improves score differentiation.

Made-with: Cursor
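For illustration, a minimal sketch of the dropoff heuristic described above, assuming results arrive sorted by score descending; the function name and result shape are hypothetical, and the actual implementation in automem/api/recall.py may differ:

```python
# Illustrative sketch only: names and result shape are assumed, not taken
# from automem/api/recall.py.
def apply_adaptive_floor(results: list[dict]) -> list[dict]:
    """Cut the tail below the steepest dropoff (>15% of the max score)
    found in the top half of a descending-score result list."""
    if len(results) < 2:
        return results
    scores = [float(r.get("final_score", 0.0)) for r in results]
    max_score = scores[0]
    if max_score <= 0:
        return results
    # Only scan the top half so a naturally sparse tail does not
    # trigger the floor on its own.
    for i in range(1, len(scores) // 2 + 1):
        if scores[i - 1] - scores[i] > 0.15 * max_score:
            return results[:i]  # keep everything above the dropoff
    return results

# e.g. scores [0.90, 0.88, 0.60, 0.58]: the 0.88 -> 0.60 gap (0.28)
# exceeds 0.15 * 0.90 = 0.135, so only the top two results survive.
```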
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 Walkthrough

Summary by CodeRabbit
Release Notes

Walkthrough
Implements minimum relevance score filtering and adaptive score floor detection for the recall API to eliminate low-quality results. Adds configuration constants, updates the recall pipeline to apply filtering during per-query and final result assembly, adds a GitHub Actions workflow for documentation updates, and updates benchmark infrastructure to support testing these changes.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (5 passed)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
automem/api/recall.py (1)
1450-1515: ⚠️ Potential issue | 🟠 Major
`score_filter.filtered_count` underreports filtering when adaptive floor is active. The response currently reports `filtered_count=0` for adaptive-floor-only filtering, and undercounts when both filters apply. This makes API metadata inaccurate.

🛠️ Suggested fix
```diff
-        # Apply explicit min_score on final assembled results (catches expansions)
-        pre_filter_count = len(results)
+        # Apply explicit min_score on final assembled results (catches expansions)
+        adaptive_filtered_count = 0
+        min_score_filtered_count = 0
+        pre_min_score_count = len(results)
         if min_score is not None and min_score > 0:
             results = [r for r in results if float(r.get("final_score", 0.0)) >= min_score]
+            min_score_filtered_count = pre_min_score_count - len(results)
@@
-        if min_score or score_floor_applied:
+        # If adaptive floor was applied earlier, capture its impact too.
+        if score_floor_applied is not None:
+            # Recompute relative to the list before min_score filtering.
+            adaptive_filtered_count = max(0, pre_min_score_count - len(results) - min_score_filtered_count)
+
+        if (min_score is not None and min_score > 0) or score_floor_applied is not None:
             response["score_filter"] = {
                 "min_score": min_score,
                 "adaptive_floor": score_floor_applied,
-                "filtered_count": pre_filter_count - len(results) if min_score else 0,
+                "filtered_count": adaptive_filtered_count + min_score_filtered_count,
+                "adaptive_filtered_count": adaptive_filtered_count,
+                "min_score_filtered_count": min_score_filtered_count,
             }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@automem/api/recall.py` around lines 1450 - 1515, The response's score_filter.filtered_count currently sets filtered_count to 0 when only the adaptive floor (score_floor_applied) removed results because it uses "if min_score else 0"; update the logic so filtered_count always equals pre_filter_count - len(results) whenever either min_score or score_floor_applied is true. In other words, in recall.py where the response["score_filter"] is built (symbols: score_filter, min_score, score_floor_applied, pre_filter_count, results), replace the conditional expression for filtered_count with filtered_count = pre_filter_count - len(results) so adaptive-floor-only and combined filters are reported correctly.
🧹 Nitpick comments (3)
scripts/bench/health_check.py (3)
212-213: Silent exception swallowing hides failures. As flagged by static analysis, the `try-except-continue` pattern silently drops errors without any indication. For a health check tool, logging or at least counting these failures would improve observability.

📝 Proposed fix to track failed queries
```diff
-        except Exception:
-            continue
+        except Exception as e:
+            # Track failure but continue checking other queries
+            query_results[query[:40]] = []  # Empty result indicates failure
+            continue
```

Or add a counter for failed queries in the return value.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/bench/health_check.py` around lines 212 - 213, The except Exception: continue in health_check.py silently swallows errors — update the error handling in the try/except block (the block catching Exception) to record failures instead of just continuing: increment a failed_queries counter (or add it to the function's return dict), and log the exception details using the existing logger (e.g., logger.exception or logger.error with the exception) so failures are observable; ensure the function signature/return value is updated to include the failed count where callers can access it.
73-75: Consider logging the exception for debugging. Catching a blind `Exception` and only storing `str(e)` loses the traceback. For a diagnostic tool, having more context in case of failures would be helpful for troubleshooting.

💡 Optional: Add logging for better diagnostics
```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 # In the exception handler:
 except Exception as e:
+    logger.debug("Query failed: %s", query, exc_info=True)
     per_query.append({"query": query, "error": str(e)})
     continue
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/bench/health_check.py` around lines 73 - 75, The except block that appends per_query.append({"query": query, "error": str(e)}) discards traceback info; add structured logging there by ensuring a logger exists (e.g., logger = logging.getLogger(__name__) at module top) and call logger.exception(...) or logger.error(..., exc_info=True) inside the except to record the full traceback alongside keeping the existing per_query entry (update the except in the function that iterates queries to call logger.exception(f"Error executing query: {query}") or similar).
14-14: Unused import: `math`. The `math` module is imported but never used in this file.

🧹 Remove unused import
```diff
 import argparse
 import json
-import math
 import os
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/bench/health_check.py` at line 14, Remove the unused import of the math module by deleting the "import math" statement at the top of the file; verify there are no remaining references to math in functions or variables (e.g., any health check helpers) and run the linter/formatter to ensure the change is clean.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/docs-dispatch.yml:
- Around line 30-35: The MAP assignment uses command substitution and then
checks $? (MAP=$(curl -sf ...); if [ $? -ne 0 ] ...), which is unreachable under
GitHub Actions' default errexit; change the flow so the curl invocation is
executed inside a conditional (i.e., use an if/then that runs curl into MAP and
handles the failure branch directly) instead of relying on checking $?, and
remove the separate if [ $? -ne 0 ] check; update references to MAP and the
failure-handling branch to still echo the skip message and set affected=none
when curl fails.
In `@benchmarks/EXPERIMENT_LOG.md`:
- Around line 13-23: The LoCoMo-mini question count is inconsistent: update
either the tier entry "`locomo-mini` (2 convos, 198 Qs)" or the Results cell
"76.97% (234/304)" so both show the same question total; specifically, edit the
two occurrences— the tier line containing `locomo-mini` and the Results row cell
under "LoCoMo-mini" (currently "76.97% (234/304)")—to use the correct,
reconciled question count and adjust the numerator/percentage if needed so the
percentage and raw counts match the chosen total.
In `@scripts/bench/health_check.py`:
- Line 124: The p95_ms calculation can index past the end for small samples;
replace the direct index int(len(latencies)*0.95) with a clamped index, e.g.
compute idx = min(len(latencies) - 1, int(len(latencies) * 0.95)) (keeping the
existing guard for empty lists) and use sorted(latencies)[idx] for "p95_ms" so
the code using the latencies variable never raises IndexError; update the
expression where "p95_ms" is set to use this clamped index.
In `@tests/benchmarks/test_locomo.py`:
- Around line 578-584: The comparison currently strips timezone info via
q_date.replace(tzinfo=None) and m_date.replace(tzinfo=None) which removes
offsets without converting; instead, normalize both q_date and m_date to UTC
first (e.g., q_date.astimezone(datetime.timezone.utc) and
m_date.astimezone(datetime.timezone.utc)) and then drop tzinfo before computing
days_diff so equivalent instants in different zones compare correctly; update
the loop handling of question_dates, memory_dates, q_date and m_date accordingly
while keeping the tolerance_days comparison logic unchanged.
---
Outside diff comments:
In `@automem/api/recall.py`:
- Around line 1450-1515: The response's score_filter.filtered_count currently
sets filtered_count to 0 when only the adaptive floor (score_floor_applied)
removed results because it uses "if min_score else 0"; update the logic so
filtered_count always equals pre_filter_count - len(results) whenever either
min_score or score_floor_applied is true. In other words, in recall.py where the
response["score_filter"] is built (symbols: score_filter, min_score,
score_floor_applied, pre_filter_count, results), replace the conditional
expression for filtered_count with filtered_count = pre_filter_count -
len(results) so adaptive-floor-only and combined filters are reported correctly.
---
Nitpick comments:
In `@scripts/bench/health_check.py`:
- Around line 212-213: The except Exception: continue in health_check.py
silently swallows errors — update the error handling in the try/except block
(the block catching Exception) to record failures instead of just continuing:
increment a failed_queries counter (or add it to the function's return dict),
and log the exception details using the existing logger (e.g., logger.exception
or logger.error with the exception) so failures are observable; ensure the
function signature/return value is updated to include the failed count where
callers can access it.
- Around line 73-75: The except block that appends per_query.append({"query":
query, "error": str(e)}) discards traceback info; add structured logging there
by ensuring a logger exists (e.g., logger = logging.getLogger(__name__) at
module top) and call logger.exception(...) or logger.error(..., exc_info=True)
inside the except to record the full traceback alongside keeping the existing
per_query entry (update the except in the function that iterates queries to call
logger.exception(f"Error executing query: {query}") or similar).
- Line 14: Remove the unused import of the math module by deleting the "import
math" statement at the top of the file; verify there are no remaining references
to math in functions or variables (e.g., any health check helpers) and run the
linter/formatter to ensure the change is clean.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- .github/workflows/docs-dispatch.yml
- .gitignore
- Makefile
- automem/api/recall.py
- automem/config.py
- benchmarks/EXPERIMENT_LOG.md
- benchmarks/baselines/health_baseline.json
- benchmarks/baselines/locomo-mini_baseline.json
- scripts/bench/health_check.py
- scripts/bench/restore_and_eval.sh
- tests/benchmarks/test_locomo.py
🧹 Nitpick comments (2)
scripts/bench/health_check.py (2)
315-317: Consider passing the token as a parameter instead of mutating global state. Using `global API_TOKEN` works but is a code smell. For a standalone CLI script this is acceptable, but if the module grows or is imported elsewhere, passing the token through function parameters would be cleaner (a sketch follows after the agent prompt below).

This is a minor observation; the current approach is fine for this utility script's scope.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/bench/health_check.py` around lines 315 - 317, The current code sets a module-level API_TOKEN by mutating the global variable API_TOKEN when args.api_token is present; instead, refactor to pass the token as an explicit parameter to the functions that need it (e.g., the functions that currently read API_TOKEN such as health_check(), make_request(), or any helper that uses API_TOKEN), remove the global assignment (delete the global API_TOKEN = args.api_token block), and update the CLI entrypoint (where args are parsed) to call those functions with args.api_token (or None) so token state is passed explicitly rather than relying on global mutation.
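For illustration, a minimal sketch of the parameter-passing refactor suggested above; the signatures and the default URL are assumptions, not the script's actual ones:

```python
# Sketch of threading the token through explicitly instead of mutating a
# module-level API_TOKEN. Signatures and the default URL are assumed.
import argparse
from typing import Optional

def get_headers(api_token: Optional[str]) -> dict:
    # Token arrives as an argument rather than via global state.
    return {"Authorization": f"Bearer {api_token}"} if api_token else {}

def check_score_distribution(base_url: str, api_token: Optional[str]) -> dict:
    headers = get_headers(api_token)
    ...  # issue requests with `headers` exactly as before
    return {}

def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--base-url", default="http://localhost:8001")  # assumed default
    parser.add_argument("--api-token", default=None)
    args = parser.parse_args()
    check_score_distribution(args.base_url, args.api_token)
```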
221-222: Silent exception swallowing reduces observability. Unlike `check_score_distribution`, which records errors in `per_query`, this function silently continues when a request fails. Consider logging the exception or tracking failed queries in the output for debugging.

🔧 Suggested improvement for error visibility
```diff
+    failed_queries: List[str] = []
     for query in DIVERSE_QUERIES[:3]:
         try:
             resp = requests.get(
                 f"{base_url}/recall",
                 params={"query": query, "limit": 5},
                 headers=get_headers(),
                 timeout=30,
             )
             resp.raise_for_status()
             data = resp.json()
-        except Exception:
+        except Exception as e:
+            failed_queries.append(f"{query[:40]}: {e}")
             continue
```

Then include `"failed_queries": failed_queries` in the returned dict.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/bench/health_check.py` around lines 221 - 222, The except Exception: block that currently just does "continue" should record and surface failures: catch the exception, log it (e.g., logging.exception or logger.error with traceback) and append the current query identifier to a failed_queries list (create it if missing) instead of silently skipping; also ensure the function returns the failed_queries list by adding "failed_queries": failed_queries to the output dict, mirroring how check_score_distribution records errors. Reference the existing except Exception: block and the failed_queries variable name to locate where to add logging and appending and update the returned dict accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scripts/bench/health_check.py`:
- Around line 315-317: The current code sets a module-level API_TOKEN by
mutating the global variable API_TOKEN when args.api_token is present; instead,
refactor to pass the token as an explicit parameter to the functions that need
it (e.g., the functions that currently read API_TOKEN such as health_check(),
make_request(), or any helper that uses API_TOKEN), remove the global assignment
(delete the global API_TOKEN = args.api_token block), and update the CLI
entrypoint (where args are parsed) to call those functions with args.api_token
(or None) so token state is passed explicitly rather than relying on global
mutation.
- Around line 221-222: The except Exception: block that currently just does
"continue" should record and surface failures: catch the exception, log it
(e.g., logging.exception or logger.error with traceback) and append the current
query identifier to a failed_queries list (create it if missing) instead of
silently skipping; also ensure the function returns the failed_queries list by
adding "failed_queries": failed_queries to the output dict, mirroring how
check_score_distribution records errors. Reference the existing except
Exception: block and the failed_queries variable name to locate where to add
logging and appending and update the returned dict accordingly.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- .github/workflows/docs-dispatch.yml
- automem/api/recall.py
- scripts/bench/health_check.py
🚧 Files skipped from review as they are similar to previous changes (2)
- .github/workflows/docs-dispatch.yml
- automem/api/recall.py
Resolving CodeRabbit comment #2873293369
✅ Addressed and implemented
Note: Addressed in fcbe5c5: updated Tier 1 LoCoMo-mini benchmark row to 304 questions so it matches the 234/304 result denominator.
Resolved via CodeRabbit MCP
Resolving CodeRabbit comment #2873293381
✅ Addressed and implemented
Note: Addressed in fcbe5c5: normalized extracted dates to UTC before comparison in match_dates_fuzzy and computed tolerance using total_seconds/86400 (a sketch of the normalization follows below).
Resolved via CodeRabbit MCP
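For illustration, a minimal standalone sketch of that normalization; the helper below is hypothetical and only mirrors the logic described in the resolution note, not the actual match_dates_fuzzy code:

```python
# Hypothetical version of the fix: convert both sides to UTC before
# dropping tzinfo, and express tolerance via total_seconds().
from datetime import datetime, timezone

def dates_match(q_date: datetime, m_date: datetime, tolerance_days: float = 1.0) -> bool:
    def to_naive_utc(dt: datetime) -> datetime:
        if dt.tzinfo is not None:
            dt = dt.astimezone(timezone.utc)  # convert the instant, don't just strip the offset
        return dt.replace(tzinfo=None)

    delta = to_naive_utc(q_date) - to_naive_utc(m_date)
    days_diff = abs(delta.total_seconds()) / 86400  # fractional days
    return days_diff <= tolerance_days
```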
Conversation resolved: Resolved by fcbe5c5: locomo-mini tier question count aligned with benchmark denominator.
Resolved via CodeRabbit MCP
Conversation resolved: Resolved by fcbe5c5: date matching now normalizes parsed datetimes to UTC before tolerance comparison.
Resolved via CodeRabbit MCP
@coderabbitai review
✅ Actions performed: Review triggered.
🤖 I have created a release *beep* *boop*

---

## [0.13.0](v0.12.0...v0.13.0) (2026-03-02)

### Features

* **bench:** benchmark testing infrastructure for rapid iteration ([#97](#97)) ([80a6f93](80a6f93))
* **recall:** add min_score threshold and adaptive floor filtering ([#73](#73)) ([#101](#101)) ([8df3c08](8df3c08))
* **viewer:** add standalone graph-viewer runtime files ([5bcb6db](5bcb6db))
* **viewer:** consolidate stable core and split-ready compatibility ([#94](#94)) ([958da72](958da72))
* **viewer:** externalize visualizer with /viewer compatibility routes ([29bafcf](29bafcf))
* **viewer:** merge visualizer stable core branch ([96b27bf](96b27bf))

### Bug Fixes

* FalkorDB data not persisting across restarts ([3bbc834](3bbc834))
* FalkorDB data not persisting across restarts ([#99](#99)) ([8490d36](8490d36))
* **mcp-sse:** sync tool schemas for SSE/MCP parity ([#104](#104)) ([d99b86d](d99b86d))

### Documentation

* **bench:** add PR [#73](#73), [#80](#80), and [#87](#87) experiment results ([#103](#103)) ([8533fac](8533fac))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Summary
- Adds `min_score` query parameter to `/recall` for explicit noise filtering
- Adds `RECALL_MIN_SCORE` env var (default `0.0`) as server-wide default
- Adds `RECALL_ADAPTIVE_FLOOR` (default `true`), which detects steep score dropoffs and prunes low-quality result tails automatically
- Includes `score_filter` metadata in response when filtering is active

A usage sketch follows below.
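For illustration, a request with an explicit threshold; the endpoint, parameter, and metadata field names come from this PR, while the host, port, and exact response layout are assumptions:

```python
# Hypothetical client call; /recall, min_score, and score_filter come from
# this PR, but the host/port and response layout are assumed.
import requests

resp = requests.get(
    "http://localhost:8001/recall",  # assumed host/port
    params={"query": "project deadlines", "limit": 10, "min_score": 0.5},
    headers={"Authorization": "Bearer <token>"},
    timeout=30,
)
data = resp.json()
# When filtering is active, the response carries score_filter metadata, e.g.:
#   {"min_score": 0.5, "adaptive_floor": null, "filtered_count": 3}
print(data.get("score_filter"))
```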
Benchmark Results

No regression. The min_score infrastructure is ready, but its impact depends on #78 (decay fix) to improve score differentiation. Current scores cluster tightly (0.35-0.39 range), so threshold filtering has limited effect until that's addressed.
Health Check
Score distribution remains DEGRADED (low variance). This is expected, as this PR adds filtering infrastructure, not scoring improvements.
Test Plan
- `min_score=0.5` correctly filters all results below the threshold (a test sketch follows below)

Closes #73
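For illustration, that check as a pytest-style sketch; the `client` fixture and response keys are assumptions, not the repo's actual test harness:

```python
# Hypothetical pytest sketch; the `client` fixture and response keys are
# assumed, not taken from the repo's test suite.
def test_min_score_filters_low_scores(client):
    resp = client.get("/recall", query_string={"query": "anything", "min_score": 0.5})
    data = resp.get_json()
    # Every surviving result sits at or above the threshold...
    assert all(float(r.get("final_score", 0.0)) >= 0.5 for r in data["results"])
    # ...and the response reports that filtering was active.
    assert data["score_filter"]["min_score"] == 0.5
```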
Made with Cursor