ff-680 (gh-507) - Allowed Unicode characters in the native feed tags to support non-English tags that are not owned by Feeds Fun #508

Merged: Tiendil merged 11 commits into main from feature/ff-680-gibberish-native-tags on Mar 31, 2026

Conversation

@Tiendil (Owner) commented Mar 31, 2026

No description provided.

Copilot AI review requested due to automatic review settings March 31, 2026 17:00
Copilot AI left a comment

Pull request overview

This PR updates tag normalization to preserve Unicode characters for “final” (non-normalized) tags—especially native feed tags—so non-English feed-provided tags can be kept as-is while still slugifying consistently.

Changes:

  • Add allow_unicode support to ffun.tags.converters.normalize() and expand converter test coverage for Unicode normalization equivalence.
  • Move “mode from categories” logic into ffun.tags.domain.mode_from_categories() and make TagInNormalization.mode an explicit field set during prepare_for_normalization().
  • Update normalizer/domain tests to pass mode explicitly and validate final tags skip normalizers.
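
The effect of an `allow_unicode` switch on slugification can be illustrated with a minimal standard-library sketch. This is not the code of `ffun.tags.converters.normalize()`: `sketch_normalize` is a hypothetical stand-in modeled on common slugify implementations, and the choice of the NFKC and NFKD normalization forms is an assumption made for illustration.

```python
import re
import unicodedata


def sketch_normalize(tag: str, allow_unicode: bool = False) -> str:
    """Hypothetical slugifier; NOT the actual ffun implementation."""
    if allow_unicode:
        # Compose characters so that equivalent inputs share one form.
        value = unicodedata.normalize("NFKC", tag)
    else:
        # Decompose, then drop everything without an ASCII representation.
        value = unicodedata.normalize("NFKD", tag).encode("ascii", "ignore").decode("ascii")
    # Keep only word characters, whitespace, and hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


print(sketch_normalize("Hello World"))
print(sketch_normalize("Привет Мир", allow_unicode=True))
print(sketch_normalize("Привет Мир"))
```

In this sketch, composed and decomposed spellings of the same text (for example NFC and NFD forms of "café") produce the same slug when `allow_unicode=True`, which is the kind of normalization-form equivalence the converter tests are described as covering.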

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| ffun/ffun/tags/entities.py | Replaces computed mode property with an explicit mode field on TagInNormalization. |
| ffun/ffun/tags/domain.py | Introduces mode_from_categories() and uses it to decide Unicode allowance during prepare_for_normalization(). |
| ffun/ffun/tags/converters.py | Adds allow_unicode parameter to slugification and wraps output in TagUid. |
| ffun/ffun/tags/tests/test_converters.py | Adds test matrix for Unicode vs ASCII slugification and normalization-form equivalence. |
| ffun/ffun/tags/tests/test_domain.py | Adds tests for mode_from_categories, Unicode behavior by mode, and “final tags skip normalizers”. |
| ffun/ffun/tags/tests/test_entities.py | Removes the now-obsolete test that relied on TagInNormalization.mode being computed. |
| ffun/ffun/tags/normalizers/tests/test_splitter.py | Extends splitter test cases to cover Unicode slugs and non-English separators; passes mode. |
| ffun/ffun/tags/normalizers/tests/test_part_replacer.py | Extends replacements/test cases to cover Unicode; passes mode. |
| ffun/ffun/tags/normalizers/tests/test_part_blacklist.py | Extends blacklist/test cases to cover Unicode; passes mode. |
| ffun/ffun/tags/normalizers/tests/test_form_normalizer.py | Ensures form normalizer handling is safe for Unicode-preserved tags; sets mode. |
| ffun/ffun/tags/normalizers/tests/test_base.py | Updates fixtures to include mode. |
| changes/unreleased.md | Removes the old “No changes.” placeholder entry. |
| changes/next_release.md | Adds a release note entry for ff-680 / gh-507. |
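
The move from a computed `mode` property to an explicit field that is set once during preparation might look roughly like the sketch below. Every name here (`SketchCategory`, `SketchMode`, `sketch_mode_from_categories`, `sketch_prepare`) is hypothetical, and the rule that feed-provided categories map to the final mode is an assumption; the actual categories, modes, and rules live in `ffun.tags` and are not shown in this review.

```python
from dataclasses import dataclass
from enum import Enum


class SketchCategory(str, Enum):  # hypothetical categories
    feed_tag = "feed_tag"
    free_form = "free_form"


class SketchMode(str, Enum):  # hypothetical modes
    raw = "raw"      # tag would go through the normalizers
    final = "final"  # tag is kept as-is and skips the normalizers


def sketch_mode_from_categories(categories: set[SketchCategory]) -> SketchMode:
    # Assumed rule: tags provided by the feed itself are treated as final.
    if SketchCategory.feed_tag in categories:
        return SketchMode.final
    return SketchMode.raw


@dataclass(frozen=True)
class SketchTagInNormalization:
    uid: str
    categories: frozenset[SketchCategory]
    mode: SketchMode  # explicit field, set once during preparation


def sketch_prepare(raw_tag: str, categories: set[SketchCategory]) -> SketchTagInNormalization:
    mode = sketch_mode_from_categories(categories)
    # Assumption: only final tags keep their Unicode characters.
    allow_unicode = mode is SketchMode.final
    uid = raw_tag.strip().lower().replace(" ", "-")
    if not allow_unicode:
        uid = uid.encode("ascii", "ignore").decode("ascii").strip("-")
    return SketchTagInNormalization(uid=uid, categories=frozenset(categories), mode=mode)


tag = sketch_prepare("Новости Дня", {SketchCategory.feed_tag})
print(tag.uid, tag.mode.value)
```

Storing the mode on the entity lets tests construct tags with any mode directly instead of having to pick categories that happen to produce it.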


def normalize(tag: str) -> TagUid:
# TODO: add tests for allow_unicode behaviour
Copilot AI Mar 31, 2026

The inline TODO about adding tests for allow_unicode behavior looks outdated now that this PR adds normalize(..., allow_unicode=...) test coverage in ffun/tags/tests/test_converters.py. Please remove or update the TODO to avoid misleading future readers.

Suggested change
- # TODO: add tests for allow_unicode behaviour
+ # NOTE: allow_unicode behaviour is covered by tests in ffun/tags/tests/test_converters.py

Tiendil and others added 4 commits March 31, 2026 19:09
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Tiendil Tiendil merged commit 0306f68 into main Mar 31, 2026
2 checks passed
@Tiendil Tiendil deleted the feature/ff-680-gibberish-native-tags branch March 31, 2026 17:16
2 participants