Add to_unicode, permissive, and recovery_mode flags to MARCReader by dchud · Pull Request #80 · dchud/mrrc

dchud · 2026-04-12T21:46:57Z

Summary

Fixes #78 — adds pymarc-compatible to_unicode and permissive kwargs to MARCReader, and exposes mrrc's existing RecoveryMode as a recovery_mode kwarg.

to_unicode: accepted for pymarc compat; mrrc always converts MARC-8 → UTF-8, so False emits a warning but has no effect
permissive=True: yields None for records that fail to parse, matching pymarc's behavior exactly
recovery_mode: exposes mrrc's Rust-native RecoveryMode ("strict", "lenient", "permissive") for salvaging partial data from damaged records
Combining permissive=True with non-strict recovery_mode raises ValueError (conflicting strategies)
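
The rules in the list above can be sketched in plain Python. This is a minimal stand-in, not mrrc's actual code: the helper name `validate_reader_flags`, the default values, the warning category, the message text, and the rejection of unknown mode strings are all assumptions. Only the kwarg names, the three mode strings, the warning on `to_unicode=False`, and the ValueError on conflicting strategies come from the summary.

```python
import warnings

# The three mode strings named in the PR summary.
VALID_RECOVERY_MODES = ("strict", "lenient", "permissive")


def validate_reader_flags(to_unicode=True, permissive=False, recovery_mode="strict"):
    """Hypothetical helper mirroring the kwarg rules from the summary.

    Defaults are assumed for illustration; they are not confirmed by the PR.
    """
    # Assumption: unknown mode strings are rejected up front.
    if recovery_mode not in VALID_RECOVERY_MODES:
        raise ValueError(f"unknown recovery_mode: {recovery_mode!r}")

    # to_unicode=False is accepted but only warns, since MARC-8 is always
    # converted to UTF-8.
    if not to_unicode:
        warnings.warn(
            "to_unicode=False has no effect; records are always converted to UTF-8",
            UserWarning,
            stacklevel=2,
        )

    # permissive=True (yield None) and a non-strict recovery_mode (salvage
    # partial data) are conflicting strategies.
    if permissive and recovery_mode != "strict":
        raise ValueError("permissive=True cannot be combined with a non-strict recovery_mode")

    return {"permissive": permissive, "recovery_mode": recovery_mode}
```

With this sketch, `validate_reader_flags(permissive=True, recovery_mode="lenient")` raises ValueError, while `validate_reader_flags(recovery_mode="lenient")` passes.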

No Rust core changes — permissive is handled in the Python wrapper's __next__, recovery_mode is passed through PyO3 to the existing MarcReader::with_recovery_mode().
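
As a rough illustration of handling permissive in a Python wrapper's `__next__`, the sketch below wraps an arbitrary parse function and yields None where parsing fails. The class name `PermissiveReaderSketch`, the `parse` callable, and the use of ValueError as the parse error are hypothetical; the real wrapper sits on top of the PyO3 reader and may differ in every detail.

```python
class PermissiveReaderSketch:
    """Hypothetical iterator: yields None where the inner parser raises."""

    def __init__(self, raw_records, parse, permissive=False):
        self._raw = iter(raw_records)
        self._parse = parse
        self._permissive = permissive

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the underlying iterator propagates and ends the loop.
        raw = next(self._raw)
        try:
            return self._parse(raw)
        except ValueError:
            # Assumption: parse failures surface as ValueError in this sketch.
            if self._permissive:
                return None  # placeholder for the record that failed to parse
            raise


def parse_int(raw):
    """Stand-in parser: int() raises ValueError on malformed input."""
    return int(raw)


results = list(PermissiveReaderSketch(["1", "oops", "3"], parse_int, permissive=True))
print(results)  # [1, None, 3]
```

Callers iterating in permissive mode need to skip the None entries, for example with `if record is None: continue`.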

Test plan

15 new Python tests in test_marcreader_flags.py covering all kwargs, edge cases, and conflict validation
All 775 Rust tests pass
All 611 Python tests pass
Full .cargo/check.sh passes

Bead: bd-y331

🤖 Generated with Claude Code

…oses #78)

pymarc-compatible kwargs for MARCReader:
- to_unicode: accepted for compat, warns if False (mrrc always converts)
- permissive: yields None for bad records instead of raising (pymarc behavior)
- recovery_mode: exposes mrrc's RecoveryMode for salvaging partial data

Updated migration guide, reading tutorial, and quickstart with error handling documentation.

Bead: bd-y331

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codspeed-hq · 2026-04-12T21:50:46Z

Merging this PR will improve performance by 28.68%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 8 improved benchmarks
✅ 52 untouched benchmarks
⏩ 16 skipped benchmarks¹

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	WallTime	`test_process_4_files_sequential`	85.7 ms	75.7 ms	+13.26%
⚡	WallTime	`test_pipeline_parallel_2x_10k_threaded`	50.7 ms	41.8 ms	+21.23%
⚡	WallTime	`test_pipeline_parallel_4x_10k_threaded`	104.4 ms	88.1 ms	+18.54%
⚡	WallTime	`test_process_4_files_parallel_4_threads`	108.7 ms	93.7 ms	+16%
⚡	WallTime	`test_pipeline_sequential_extraction_4x_10k`	105.5 ms	94 ms	+12.17%
⚡	WallTime	`test_pipeline_sequential_4x_10k`	84.5 ms	76.6 ms	+10.28%
⚡	WallTime	`test_file_parallel_4x_10k_with_extraction`	1,096.1 ms	851.8 ms	+28.68%
⚡	WallTime	`test_pipeline_sequential_1x_10k`	20.4 ms	18.4 ms	+11.12%

Comparing bd-y331-marcreader-flags (9ec3c31) with main (dd4c671)

¹ 16 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
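
The Efficiency column in the table above appears consistent with BASE / HEAD - 1. That is an inference from the reported numbers, not something taken from CodSpeed's documentation, and the function name `efficiency` is hypothetical. Because the displayed times are rounded, recomputed values can differ slightly from the reported percentages.

```python
def efficiency(base_ms, head_ms):
    """Relative speedup of HEAD over BASE (assumed formula: BASE / HEAD - 1)."""
    return base_ms / head_ms - 1.0


# (BASE, HEAD) wall times in milliseconds, copied from the table.
rows = {
    "test_file_parallel_4x_10k_with_extraction": (1096.1, 851.8),
    "test_pipeline_parallel_2x_10k_threaded": (50.7, 41.8),
    "test_pipeline_sequential_1x_10k": (20.4, 18.4),
}

for name, (base, head) in rows.items():
    print(f"{name}: {efficiency(base, head):+.2%}")
```

The headline figure of 28.68% corresponds to the largest single improvement (`test_file_parallel_4x_10k_with_extraction`), not to an average across benchmarks.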

@acdha

Add MARCReader kwargs (to_unicode, permissive, recovery_mode) and RecoveryMode documentation to python-api.md reference. Update CHANGELOG unreleased section with PRs #79, #80, #82 and credit @acdha.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dchud self-assigned this Apr 12, 2026

dchud merged commit ae3a684 into main Apr 12, 2026
47 checks passed

This was referenced Apr 12, 2026

Codebase simplification audit: redundant code, duplication, and complexity #81

Closed

Codebase simplification: fix delete_subfield, reduce duplication #82

Merged

dchud merged 1 commit into main from bd-y331-marcreader-flags
