Tune ML worker OCR subject normalization #104
Caution: Review failed. The pull request was closed or merged during review.
📝 Walkthrough
The PR extends the ML worker parser to apply default alias normalization when no custom tuning profile is present. It expands default subject aliases with OCR/typo variants, integrates default normalization into the parser fallback logic, and adds tests validating the normalization behavior.
Changes
Default Alias Normalization
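A minimal sketch of the fallback described in the walkthrough, assuming the parser receives an optional alias mapping from a tuning profile. The names `DEFAULT_SUBJECT_ALIASES` and `normalize_subject`, and the alias entries themselves, are hypothetical and are not taken from the PR:

```python
from typing import Mapping, Optional

# Hypothetical default alias table: noisy OCR/typo spelling -> canonical subject.
DEFAULT_SUBJECT_ALIASES = {
    "mathemat1cs": "Mathematics",
    "phys1cs": "Physics",
}


def normalize_subject(
    raw: str, profile_aliases: Optional[Mapping[str, str]] = None
) -> str:
    """Map a raw OCR subject string to its canonical name.

    Uses the custom profile's aliases when one is supplied; otherwise
    falls back to the default alias table.
    """
    aliases = (
        profile_aliases if profile_aliases is not None else DEFAULT_SUBJECT_ALIASES
    )
    # Lowercase and collapse whitespace so lookups ignore case and spacing.
    key = " ".join(raw.lower().split())
    # Unknown subjects pass through unchanged apart from trimming.
    return aliases.get(key, raw.strip())
```

In this sketch an explicit custom profile replaces the defaults instead of merging with them. Whether the actual parser merges the two is not stated in the walkthrough.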
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
SahilKumar75 left a comment
ML review: this keeps the change scoped to deterministic parser tuning. The expanded default aliases cover OCR digit/letter confusions for subject names, and the parser now applies default faculty/location aliases in the no-custom-profile path. Tests cover the noisy subject, faculty, and lab-location cases.
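To illustrate the kind of OCR digit/letter confusion the review mentions, here is a hedged sketch that folds commonly confused characters before looking up a canonical name. This swaps in a folding technique for illustration; the PR itself is described as expanding explicit alias lists. The confusion map, the canonical names, and the `normalize` helper are all assumptions, not code from the PR:

```python
# Assumed confusion map: characters OCR commonly swaps, folded to one form.
# "i" and "1" both fold to "l" so that "Phys1cs" and "Physics" share a key.
OCR_CONFUSIONS = str.maketrans({"0": "o", "1": "l", "i": "l", "5": "s", "8": "b"})

# Hypothetical canonical values per field (subject, faculty, lab location).
CANONICAL = {
    "subject": ["Data Structures", "Physics"],
    "faculty": ["Dr. Rao"],
    "location": ["Lab 2"],
}


def fold(text: str) -> str:
    """Lowercase, fold confusable characters, and collapse whitespace."""
    return " ".join(text.lower().translate(OCR_CONFUSIONS).split())


# Deterministic lookup index: folded spelling -> canonical name, per field.
INDEX = {
    field: {fold(name): name for name in names}
    for field, names in CANONICAL.items()
}


def normalize(field: str, raw: str) -> str:
    """Return the canonical name for a noisy value, or the trimmed input."""
    return INDEX.get(field, {}).get(fold(raw), raw.strip())
```

Because the lookup is a plain dictionary match on folded strings, the result is deterministic, which fits the review's point about keeping the change scoped to deterministic parser tuning. Folding does risk collapsing two distinct names that differ only by a confusable character, which is one reason an explicit alias list can be the safer choice.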
Summary
Testing
Closes #103
Summary by CodeRabbit
Release Notes
Improvements
Tests