Fix case-insensitive language code matching#287

Open
rathboma wants to merge 3 commits into untra:main from rathboma:fix/case-insensitive-language-codes

@rathboma commented Jan 21, 2026

Summary

Fixes a case-sensitivity bug: documents whose language code differed in case from the config (e.g., lang: pt-br in frontmatter vs. languages: ['pt-BR'] in config) were not processed.

Problem

Users experienced:

  • Documents silently skipped during coordination
  • Files generated in _site/pt-BR/ but canonical URLs using /pt-br/
  • 404 errors when navigating
  • Broken hreflang alternate links
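
The symptoms above can be reproduced in isolation. This is a minimal sketch, assuming the earlier matching was an exact string comparison; the variable names are hypothetical and are not taken from the plugin's source.

```ruby
# Illustrative values only, taken from the example in the summary.
config_languages = ['en', 'pt-BR']
doc_lang = 'pt-br' # as written in a document's frontmatter

# An exact string comparison treats the two spellings as different languages.
exact_match = config_languages.include?(doc_lang)

# A case-insensitive comparison finds the config entry.
loose_match = config_languages.any? { |l| l.downcase == doc_lang.downcase }

# Output directories follow the config casing, while a URL built from the
# document's own value follows the frontmatter casing, so the two diverge.
output_dir = File.join('_site', 'pt-BR')
canonical_url = "/#{doc_lang}/"

puts exact_match   # false
puts loose_match   # true
puts output_dir    # _site/pt-BR
puts canonical_url # /pt-br/
```

On a case-sensitive filesystem or web server, /pt-br/ and /pt-BR/ are different paths, which would account for the 404s.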

Solution

Uses simple inline .downcase comparisons at the point of matching, with no extra data structures or helper methods. When a document's lang (from frontmatter or derived from its path) matches a config language case-insensitively, it is normalized to the config's casing.
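
The matching described above can be sketched roughly as follows. The method match_config_lang is a hypothetical wrapper added here only so the example can be run; the PR performs the comparison inline rather than through a helper.

```ruby
# Hypothetical wrapper for illustration; the PR inlines this comparison.
# Returns the config's spelling of the language, or nil if nothing matches.
def match_config_lang(languages, doc_lang)
  return nil if doc_lang.nil?

  wanted = doc_lang.to_s.downcase
  languages.find { |config_lang| config_lang.downcase == wanted }
end

languages = ['en', 'pt-BR', 'es']

p match_config_lang(languages, 'pt-br') # "pt-BR"
p match_config_lang(languages, 'PT-BR') # "pt-BR"
p match_config_lang(languages, 'en')    # "en"
p match_config_lang(languages, 'fr')    # nil
```

Because find returns the element from the config array, the result always carries the config's casing, which is what keeps file paths and URLs consistent.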

Changes

  • site.rb: derive_lang_from_path() and coordinate_documents() use inline .downcase to find the matching config language
  • Tests: Added tests for case-insensitive path derivation and document coordination

No changes to coordinate.rb or i18n_headers.rb — normalization happens upstream in coordinate_documents.
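
For the path-derivation change, a rough standalone sketch may help. It assumes the language is taken from the first segment of a document's relative path; lang_from_path is a hypothetical stand-in, and the real derive_lang_from_path() operates on a Jekyll document and may inspect the path differently.

```ruby
# Hypothetical stand-in for illustration, not the plugin's method.
# Returns the config-cased language if the first path segment matches one
# case-insensitively, otherwise nil.
def lang_from_path(languages, relative_path)
  first_segment = relative_path.split('/').reject(&:empty?).first
  return nil if first_segment.nil?

  languages.find { |config_lang| config_lang.downcase == first_segment.downcase }
end

languages = ['en', 'pt-BR']

p lang_from_path(languages, 'pt-br/about.md') # "pt-BR"
p lang_from_path(languages, 'pt-BR/about.md') # "pt-BR"
p lang_from_path(languages, 'docs/about.md')  # nil
```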

Testing

bundle exec rspec spec/jekyll/polyglot/patches/jekyll/site_spec.rb
# 40 examples, 0 failures

Backward Compatibility

No breaking changes — existing sites with consistent casing work exactly as before.

Fixes issue where documents with language codes in different cases
(e.g., lang: pt-br) were not processed when config had different
casing (e.g., languages: ['pt-BR']), causing:
- Documents silently skipped during coordination
- Files generated in _site/pt-BR/ but URLs using /pt-br/
- 404 errors and broken hreflang links

Changes:
- Add normalization infrastructure to Site class
- Normalize language codes early in coordinate_documents
- Use case-insensitive matching for all language comparisons
- Preserve original config case for file paths and URLs
- Support case-insensitive data directory lookups

All language code comparisons now use lowercase for matching while
preserving the original case from config for display and file paths.

Tests:
- Add 6 new unit tests for normalization helpers
- Add integration test validating case mismatch scenario
- Update existing test to reflect case-insensitive behavior
- All 44 tests pass with no regressions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
rathboma added a commit to rathboma/polyglot that referenced this pull request Jan 21, 2026
Merges PR untra#287 fix into combined-features branch.

This adds case-insensitive language code matching, allowing users to use
any case variant (pt-br, pt-BR, PT-BR) and have it work correctly.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

# Conflicts:
#	lib/jekyll/polyglot/liquid/tags/i18n_headers.rb
#	lib/jekyll/polyglot/patches/jekyll/site.rb
#	spec/jekyll/polyglot/patches/jekyll/site_spec.rb
Remove lang_norm_map, languages_normalized, normalize_lang(), and
lang_exists?() in favor of inline .downcase comparisons. Revert
coordinate.rb and i18n_headers.rb to original logic since doc lang
is now normalized at the coordinate_documents level. Simplify tests.
@rathboma rathboma requested a review from untra March 24, 2026 15:37