[backend] CorpusBackend filesystem: UnicodeDecodeError partial-content branch in render_index_summary lacks a direct test #313

@dep0we

Context

The Round 2 adversarial review of PR 3 led to the load-bearing F3 fix in FilesystemCorpusBackend.render_index_summary: when the INDEX file contains non-UTF-8 bytes and decoding raises UnicodeDecodeError, the method re-reads the file with errors="replace" and prepends a warning comment (<!-- WARNING: INDEX.md contained non-UTF-8 bytes; replaced. -->), matching bundle.py:_safe_read_text byte-for-byte. The fix landed in PR 3 (commit at #304), but Round 3 acknowledged that the partial-content branch lacks a direct test (R3-F2). The IRON RULE regression suite exercises the soft-degrade contract, but it does not assert the shape of the partial-content return value, that is, the prepended warning followed by the replaced text.
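For readers without the PR open, the strict-then-replace read described above can be sketched roughly as follows. This is an illustration only, not the code in filesystem.py or bundle.py: the function name read_index_with_fallback is hypothetical, and the newline separating the warning from the content is an assumption, since the issue only states that the warning is prepended.

```python
from pathlib import Path

# Warning text quoted from the issue.
WARNING = "<!-- WARNING: INDEX.md contained non-UTF-8 bytes; replaced. -->"


def read_index_with_fallback(index_path: Path) -> str:
    """Hypothetical sketch of the soft-degrade read, not the project's code."""
    try:
        # Strict decode first: clean files are returned untouched.
        return index_path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        # Re-read, swapping each undecodable byte for U+FFFD so the
        # clean portions of the file survive.
        text = index_path.read_text(encoding="utf-8", errors="replace")
        # Assumption: a single newline separates the warning from the content.
        return f"{WARNING}\n{text}"
```

A clean INDEX.md takes the first return and carries no warning, so only a fixture with invalid bytes reaches the branch this issue is about.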

Evidence

atomic_agents/corpus/filesystem.py -- render_index_summary UnicodeDecodeError catch block. The branch re-reads with errors="replace" and prepends the <!-- WARNING: ... --> comment to the partial rendered content. No test in tests/test_corpus_filesystem_backend.py or tests/test_corpus_protocol_conformance.py writes a non-UTF-8 INDEX.md fixture and asserts the prepended warning shape on the return value. Surfaced as R3-F2 in PR 3 Round 3 adversarial review (acknowledged coverage gap; PR 4 follow-up).

Proposed fix

Add a test that writes an INDEX.md fixture with deliberate non-UTF-8 bytes (e.g., a stray \xff byte) and asserts:

  1. render_index_summary("wiki") returns a non-empty string.
  2. The returned string starts with <!-- WARNING: INDEX.md contained non-UTF-8 bytes; replaced. -->.
  3. The partial content for clean portions of the INDEX is preserved.
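The three assertions above could take roughly the following shape. This is a sketch under stated assumptions: render_stand_in is a hypothetical stand-in that mimics the behavior described in this issue so the example runs on its own, and the fixture contents are invented for illustration. The real test would instead construct FilesystemCorpusBackend (its constructor arguments are not shown in this issue) and call render_index_summary("wiki").

```python
from pathlib import Path

# Warning text quoted from the issue.
WARNING = "<!-- WARNING: INDEX.md contained non-UTF-8 bytes; replaced. -->"


def render_stand_in(index_path: Path) -> str:
    """Hypothetical stand-in for render_index_summary; swap in the real backend."""
    try:
        return index_path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        text = index_path.read_text(encoding="utf-8", errors="replace")
        return f"{WARNING}\n{text}"  # separator is an assumption


def test_non_utf8_index_prepends_warning(tmp_path: Path) -> None:
    # Fixture: two clean entries around a stray \xff byte.
    index = tmp_path / "INDEX.md"
    index.write_bytes(b"# Wiki index\n- [[alpha]]\n\xff\n- [[beta]]\n")

    # Guard: the fixture must really be invalid UTF-8, otherwise the
    # UnicodeDecodeError branch is never reached.
    try:
        index.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        pass
    else:
        raise AssertionError("fixture must not be valid UTF-8")

    summary = render_stand_in(index)

    assert summary                        # 1. non-empty return value
    assert summary.startswith(WARNING)    # 2. prepended warning comment
    assert "- [[alpha]]" in summary       # 3. clean content before the bad byte
    assert "- [[beta]]" in summary        #    and after it is preserved
    assert "\ufffd" in summary            # the bad byte became U+FFFD
```

The tmp_path parameter matches pytest's built-in fixture of the same name, so the function should drop into tests/test_corpus_filesystem_backend.py once the stand-in is replaced by the real backend call.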

Acceptance criteria

  • A test exercises the UnicodeDecodeError branch of render_index_summary directly with a non-UTF-8 INDEX.md fixture.
  • The test asserts the prepended warning comment AND the partial content preservation.
  • uv run pytest tests/test_corpus_filesystem_backend.py -q passes with zero regressions.

Source

PR 3 of #65 (PR #304), Round 3 adversarial finding R3-F2.

Metadata

Labels: backend (Protocol-pattern backend abstractions: memory, logs, locks, etc.), polish (Operational nice-to-haves)
