fix(e2e): resolve test failures from issue #244#245
Conversation
- filesystem backend: atomic writes + graceful recovery from corrupt JSON - codex subscribe: warn on audit-append failure instead of rolling back - shared config: invalid-repo-name warning goes to stdout - segmentation test: mark flaky to absorb LLM variance - cross-namespace sharing test: check leaked content, not echoed query - .gitignore: exclude .codex
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughNamespace persistence now uses atomic temp-file writes and stricter JSON handling; empty or corrupt namespace files are treated as missing and recovered. Added tests for these behaviors. Minor logging/message wording changes in two plugin configs and tests updated to expect messages on stderr. ChangesGit ignore
Filesystem namespace backend (atomic-save and recovery)
Plugin config validation message changes
Subscription audit logging behavior
Tests — expectations and flakiness handling
Sequence Diagram(s)sequenceDiagram
participant Client
participant Backend
participant OS
Client->>Backend: ensure_namespace(ns)
Backend->>OS: read_text(ns.json)
OS-->>Backend: file content (empty / corrupt / valid)
alt empty or corrupt content
Backend->>OS: unlink(ns.json)
Backend-->>Client: raise NamespaceNotFoundException
Client->>Backend: create_namespace(ns) / recover
end
Client->>Backend: save namespace update
Backend->>OS: write temp file (ns.json.<uuid>.tmp)
OS-->>Backend: temp file written
Backend->>OS: os.replace(temp -> ns.json) (atomic swap)
OS-->>Backend: replace success
Backend-->>Client: save confirmed
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 43 minutes and 1 second.Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
platform-integrations/claw-code/plugins/evolve-lite/lib/config.py (1)
339-352:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winInconsistent output streams for sibling warnings in
_coerce_repo.The invalid-name warning (lines 339–341) now writes to
stdout, while the invalid-scope warning (lines 349–352) still writes tostderr. Both are diagnostic/warning messages emitted from the same function for invalid config entries — they should target the same stream.If the change was driven by a test that captures
stdout, consider capturingstderrin the test instead (which is the conventional target for warnings) rather than routing a warning tostdout.🔧 Proposed fix — restore `stderr` for the invalid-name path
if not is_valid_repo_name(name.strip()): print( - f"evolve-lite: {name!r} (skipped - invalid subscription name) — only A-Z, a-z, 0-9, '.', '_', '-' allowed" + f"evolve-lite: {name!r} (skipped - invalid subscription name) — only A-Z, a-z, 0-9, '.', '_', '-' allowed", + file=sys.stderr, ) return None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@platform-integrations/claw-code/plugins/evolve-lite/lib/config.py` around lines 339 - 352, The invalid-name diagnostic in _coerce_repo currently prints to stdout while the invalid-scope diagnostic prints to stderr, causing inconsistent output streams; change the print in the invalid-name branch that references name to write to sys.stderr (matching the other warning), keep the return None behavior, and ensure you import or reference sys the same way as the scope branch so both warnings (the print that mentions "skipped - invalid subscription name" and the print that mentions "unknown scope") go to stderr; symbols to locate: function _coerce_repo, variables name/remote/scope, VALID_SCOPES, and the two print calls.platform-integrations/claude/plugins/evolve-lite/lib/config.py (1)
339-352:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winSame inconsistent output-stream issue as in
claw-code/plugins/evolve-lite/lib/config.py.Identical concern: the invalid-name
stdoutwhile the invalid-scopestderr. Both paths should use the same stream. See the proposed fix in the claw-code file review above and apply it here as well.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@platform-integrations/claude/plugins/evolve-lite/lib/config.py` around lines 339 - 352, The two diagnostic print calls in this module are inconsistent: the invalid-name print (using name) currently writes to stdout while the invalid-scope print (using entry.get("scope") and VALID_SCOPES) writes to sys.stderr; change the invalid-name print so it also writes to sys.stderr (i.e., add file=sys.stderr) to match the invalid-scope branch and ensure both error messages go to the same stream; locate the prints around the name validation and the scope validation (they reference name, entry, scope and VALID_SCOPES) and modify only the print for the invalid-name case to include file=sys.stderr.
🧹 Nitpick comments (1)
altk_evolve/backend/filesystem.py (1)
73-75: ⚡ Quick winUse a unique tmp filename per write to be safe under multi-process deployments.
The deterministic
{namespace_id}.json.tmpname is protected bythreading.Lockwithin a single process, but not across OS-level workers. If two processes ever write the same namespace concurrently (e.g., uvicorn with--workers > 1), Writer B can overwrite the tmp file after Writer A wrote it but before Writer A callsos.replace, causing Writer A to atomically commit Writer B's (possibly partial) content.Using a uniquely-named tmp file keeps the same-directory guarantee that
os.replaceneeds while eliminating the race:♻️ Proposed fix using `tempfile`
+import tempfile ... def _save_namespace_data(self, namespace_id: str, data: FilesystemNamespace): """Save namespace data to JSON file atomically. ...""" file_path = self._namespace_file(namespace_id) - tmp_path = file_path.with_suffix(file_path.suffix + ".tmp") - tmp_path.write_text(data.model_dump_json(indent=2)) - os.replace(tmp_path, file_path) + fd, tmp_name = tempfile.mkstemp(dir=file_path.parent, suffix=".tmp") + try: + with os.fdopen(fd, "w", encoding="utf-8") as f: + f.write(data.model_dump_json(indent=2)) + os.replace(tmp_name, file_path) + except Exception: + os.unlink(tmp_name) + raiseThis also adds the
encoding="utf-8"spec missing from bothwrite_textandread_text(line 55) calls — relevant if any entity content contains non-ASCII characters on Windows where the default locale encoding may not be UTF-8.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@altk_evolve/backend/filesystem.py` around lines 73 - 75, The current write uses a deterministic tmp file (tmp_path = file_path.with_suffix(...)) which races across processes; change the write in the function/method that uses tmp_path/file_path so it creates a unique temp file in the same directory (e.g., use tempfile.NamedTemporaryFile or tempfile.mkstemp with dir=file_path.parent and a unique suffix), write the JSON with encoding="utf-8" (use write() or write_text(..., encoding="utf-8")), flush/close the temp file, then call os.replace(temp_path, file_path) to atomically commit; also update the corresponding read_text call to include encoding="utf-8". Ensure the temp file is removed on exceptions if not replaced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@platform-integrations/claude/plugins/evolve-lite/lib/config.py`:
- Around line 339-352: The two diagnostic print calls in this module are
inconsistent: the invalid-name print (using name) currently writes to stdout
while the invalid-scope print (using entry.get("scope") and VALID_SCOPES) writes
to sys.stderr; change the invalid-name print so it also writes to sys.stderr
(i.e., add file=sys.stderr) to match the invalid-scope branch and ensure both
error messages go to the same stream; locate the prints around the name
validation and the scope validation (they reference name, entry, scope and
VALID_SCOPES) and modify only the print for the invalid-name case to include
file=sys.stderr.
In `@platform-integrations/claw-code/plugins/evolve-lite/lib/config.py`:
- Around line 339-352: The invalid-name diagnostic in _coerce_repo currently
prints to stdout while the invalid-scope diagnostic prints to stderr, causing
inconsistent output streams; change the print in the invalid-name branch that
references name to write to sys.stderr (matching the other warning), keep the
return None behavior, and ensure you import or reference sys the same way as the
scope branch so both warnings (the print that mentions "skipped - invalid
subscription name" and the print that mentions "unknown scope") go to stderr;
symbols to locate: function _coerce_repo, variables name/remote/scope,
VALID_SCOPES, and the two print calls.
---
Nitpick comments:
In `@altk_evolve/backend/filesystem.py`:
- Around line 73-75: The current write uses a deterministic tmp file (tmp_path =
file_path.with_suffix(...)) which races across processes; change the write in
the function/method that uses tmp_path/file_path so it creates a unique temp
file in the same directory (e.g., use tempfile.NamedTemporaryFile or
tempfile.mkstemp with dir=file_path.parent and a unique suffix), write the JSON
with encoding="utf-8" (use write() or write_text(..., encoding="utf-8")),
flush/close the temp file, then call os.replace(temp_path, file_path) to
atomically commit; also update the corresponding read_text call to include
encoding="utf-8". Ensure the temp file is removed on exceptions if not replaced.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ff5fdb0b-f09a-4c15-8202-8ffb15eee833
📒 Files selected for processing (7)
.gitignorealtk_evolve/backend/filesystem.pyplatform-integrations/claude/plugins/evolve-lite/lib/config.pyplatform-integrations/claw-code/plugins/evolve-lite/lib/config.pyplatform-integrations/codex/plugins/evolve-lite/skills/subscribe/scripts/subscribe.pytests/e2e/test_e2e_segmentation.pytests/e2e/test_sharing.py
There was a problem hiding this comment.
🧹 Nitpick comments (1)
platform-integrations/claude/plugins/evolve-lite/lib/config.py (1)
338-351: ⚡ Quick winInconsistent output streams for sibling warnings within
_coerce_repo.The invalid-name path (line 339) now emits to stdout via bare
print(), while the invalid-scope path (lines 347-350) still emits to stderr viaprint(..., file=sys.stderr). Both are diagnostics for malformed config entries in the same function; routing them to different streams is surprising to callers.As noted in the relevant context,
sync.pyalready usescapture_output=Truefor subprocess calls. If any parent process captures these script invocations with stdout capture, the invalid-name warning will be inadvertently swallowed into the captured output buffer rather than flowing to the diagnostic stream.If the test suite change that prompted this was asserting on stdout, consider redirecting both warnings to stderr and updating the tests to check
capsys.readouterr().err(or equivalent), which keeps all config diagnostics on the standard diagnostic channel.♻️ Proposed alignment (both warnings → stderr)
if not is_valid_repo_name(name.strip()): - print(f"evolve-lite: {name!r} (skipped - invalid subscription name) — only A-Z, a-z, 0-9, '.', '_', '-' allowed") + print( + f"evolve-lite: {name!r} (skipped - invalid subscription name) — only A-Z, a-z, 0-9, '.', '_', '-' allowed", + file=sys.stderr, + ) return None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@platform-integrations/claude/plugins/evolve-lite/lib/config.py` around lines 338 - 351, In _coerce_repo, the invalid-name warning currently uses print(...) to stdout while the invalid-scope warning prints to stderr; change the invalid-name branch that prints "evolve-lite: {name!r} (skipped - invalid subscription name) ..." to write to stderr (e.g., use print(..., file=sys.stderr)) so both diagnostics go to stderr, and update any tests that asserted on stdout to assert on stderr (capsys.readouterr().err) as needed; reference symbols: function _coerce_repo, variables name and entry, and VALID_SCOPES.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@platform-integrations/claude/plugins/evolve-lite/lib/config.py`:
- Around line 338-351: In _coerce_repo, the invalid-name warning currently uses
print(...) to stdout while the invalid-scope warning prints to stderr; change
the invalid-name branch that prints "evolve-lite: {name!r} (skipped - invalid
subscription name) ..." to write to stderr (e.g., use print(...,
file=sys.stderr)) so both diagnostics go to stderr, and update any tests that
asserted on stdout to assert on stderr (capsys.readouterr().err) as needed;
reference symbols: function _coerce_repo, variables name and entry, and
VALID_SCOPES.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7a4305af-488c-4d5a-a4b3-343458c79cd7
📒 Files selected for processing (2)
platform-integrations/claude/plugins/evolve-lite/lib/config.pyplatform-integrations/claw-code/plugins/evolve-lite/lib/config.py
✅ Files skipped from review due to trivial changes (1)
- platform-integrations/claw-code/plugins/evolve-lite/lib/config.py
Codex ReviewTarget: branch diff against The filesystem backend hardening introduces recovery/concurrency gaps around namespace writes. These can still break startup or writes in realistic interrupted-write or multi-process scenarios. Full review comments
Generated by Codex (session |
- Unique tmp filenames for namespace saves: concurrent CLI/MCP writers sharing data_dir no longer clobber each other's <ns>.json.tmp, which caused FileNotFoundError at os.replace time. - Unlink empty/corrupt namespace files when classifying as missing, so the create_namespace() call inside ensure_namespace() doesn't trip on the stale file and raise NamespaceAlreadyExistsException, which left startup wedged despite the "treat as missing" recovery branch. Adds tests/unit/test_filesystem_backend.py covering both paths.
@gaodan-fang Addressed issues in the latest commit 97a3e02 |
Codex Re-review (post-update)Target: branch diff against Prior comments status
New issue found
No additional blocking issues surfaced in the inspected areas. Generated by Codex (session |
Codex Review (round 3)Target: branch diff against Two P2 issues remain — one new, one still unaddressed from the prior round. Findings
No additional blocking issues surfaced this pass. Generated by Codex (session |
|
comment #245 (comment) is addressed. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/unit/test_filesystem_backend.py (1)
33-40: ⚡ Quick winConsider adding a
ValidationErrorrecovery test to complement the corrupt-JSON case.The existing
test_ensure_namespace_recovers_from_corrupt_jsoncovers syntactically invalid JSON ({not json). Once thepydantic.ValidationErrorgap in_load_namespace_datais fixed (see the comment onfilesystem.py), add a parallel test for structurally-invalid-but-valid-JSON (e.g.{"id": "ns_valid_json", "entities": "not_a_list"}) to lock in that recovery path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_filesystem_backend.py` around lines 33 - 40, Add a new unit test alongside test_ensure_namespace_recovers_from_corrupt_json that creates a namespace JSON file containing syntactically valid JSON but wrong schema (e.g. {"id": "ns_valid_json", "entities": "not_a_list"}) to exercise the pydantic.ValidationError path in _load_namespace_data; then call client.ensure_namespace("ns_valid_json") and assert the returned namespace has id == "ns_valid_json" and num_entities == 0, ensuring the code recovers from schema validation errors the same way it recovers from corrupt JSON.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@altk_evolve/backend/filesystem.py`:
- Around line 65-70: The code currently only catches json.JSONDecodeError when
calling FilesystemNamespace.model_validate(json.loads(raw)), but
pydantic.ValidationError can also be raised for structurally invalid (but
syntactically valid) JSON; update the except block around
FilesystemNamespace.model_validate to catch pydantic.ValidationError as well,
log the same warning (including the exception), unlink the file_path, and raise
NamespaceNotFoundException(f"Namespace `{namespace_id}` not found") from the
caught ValidationError so ensure_namespace in evolve_client.py can perform the
intended recovery path.
In `@tests/unit/test_filesystem_backend.py`:
- Around line 21-53: Add the pytest "unit" marker to the three tests by
decorating each function (test_ensure_namespace_recovers_from_zero_byte_file,
test_ensure_namespace_recovers_from_corrupt_json,
test_save_tolerates_stale_shared_tmp) with `@pytest.mark.unit`; if pytest is not
already imported in the module, add "import pytest" at the top so the decorator
resolves. Ensure the decorator is placed immediately above each def line and
keep existing docstrings and asserts unchanged.
---
Nitpick comments:
In `@tests/unit/test_filesystem_backend.py`:
- Around line 33-40: Add a new unit test alongside
test_ensure_namespace_recovers_from_corrupt_json that creates a namespace JSON
file containing syntactically valid JSON but wrong schema (e.g. {"id":
"ns_valid_json", "entities": "not_a_list"}) to exercise the
pydantic.ValidationError path in _load_namespace_data; then call
client.ensure_namespace("ns_valid_json") and assert the returned namespace has
id == "ns_valid_json" and num_entities == 0, ensuring the code recovers from
schema validation errors the same way it recovers from corrupt JSON.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 62d2fa42-654e-47f5-8f6a-96c3b9464c6a
📒 Files selected for processing (7)
altk_evolve/backend/filesystem.pyplatform-integrations/claude/plugins/evolve-lite/lib/config.pyplatform-integrations/claw-code/plugins/evolve-lite/lib/config.pytests/platform_integrations/test_bob_sharing.pytests/platform_integrations/test_codex_sharing.pytests/platform_integrations/test_sync.pytests/unit/test_filesystem_backend.py
✅ Files skipped from review due to trivial changes (1)
- platform-integrations/claw-code/plugins/evolve-lite/lib/config.py
🚧 Files skipped from review as they are similar to previous changes (1)
- platform-integrations/claude/plugins/evolve-lite/lib/config.py
Created issue #246 for this bug. |
Conflict-resolution notes (visahak's "address conflicts" ask on PR #235): * Main's PR #245 ("fix(e2e): resolve test failures from issue #244") modified the rendered platform-integrations/<claude,claw-code>/lib/ config.py and platform-integrations/codex/.../subscribe.py directly. Under this branch's unified layout these are generated artifacts — the fixes belong in plugin-source/ so they propagate to every platform on render. Two changes re-applied: 1. plugin-source/lib/config.py — print a stderr warning when _coerce_repo() rejects an invalid repo name (was silent). Updated message wording to match #245. Re-rendered to all four platforms (claude/claw-code/codex/bob); codex+bob were never changed by main but inherit consistent behavior under unify. 2. plugin-source/skills/evolve-lite/subscribe/scripts/subscribe.py — drop the rollback-on-audit-failure block; warn instead. Audit logging is best-effort, a failed append shouldn't undo a successful subscribe. Re-rendered to all four platforms; previously only codex had this fix. * .gitignore — kept both new entries (`example-guidelines` from this branch, `.codex` from main). * altk_evolve/backend/filesystem.py + tests/unit/test_filesystem_backend.py — accepted from main as-is (atomic writes + corrupt-JSON recovery for the filesystem backend). * tests/e2e/, tests/platform_integrations/test_{bob,codex}_sharing.py, tests/platform_integrations/test_sync.py — auto-merged cleanly, inspected, accepted. Verified: render-then-check is clean; 186 platform_integrations tests pass; the 4 new filesystem-backend unit tests pass.
Description:
Summary
Changes
altk_evolve/backend/filesystem.py— atomic writes via tmp +os.replace; graceful recovery from empty/corrupt JSONaltk_evolve/backend/filesystem.py— atomic writes via tmp +os.replace; graceful recovery from empty/corrupt JSONplatform-integrations/codex/.../subscribe.py— audit-append failure now warns and continues instead of rolling back with exit 1platform-integrations/{claude,claw-code}/.../lib/config.py— invalid-repo-name warning prints to stdout with "(skipped - invalid subscription name)" wordingtests/e2e/test_e2e_segmentation.py—@pytest.mark.flaky(retries=2, delay=1)to absorb LLM variancetests/e2e/test_sharing.py— assertion now checks leaked entity content +[public: alice]marker instead of the query term, which always echoes in the response header.gitignore— exclude.codexTest plan
test_subscribe_warns_when_audit_write_failspasses; fulltest_codex_sharing.pygreenAddresses issue #244
Summary by CodeRabbit
Bug Fixes
Tests
Chores