When set to false, hreflang tags are only generated for languages that have actual translations, not for fallback pages that just use the default language content. This improves SEO correctness by not advertising language alternatives that don't actually exist. Default is true (existing behavior) for backward compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
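The behavior described in this commit can be sketched roughly as follows. This is a minimal illustration, not Polyglot's actual code: `hreflang_tags`, its parameters, and the `translations` hash (language code to permalink for pages that really exist) are hypothetical names chosen for the example.

```ruby
# Hypothetical helper, not Polyglot's implementation.
# translations maps language code => permalink for pages that actually exist.
def hreflang_tags(site_url, languages, default_lang, translations, fallback: true)
  # Assumes the default-language page always exists; fetch raises otherwise.
  default_path = translations.fetch(default_lang)
  tags = languages.map do |lang|
    path = translations[lang]
    # With fallback disabled, skip languages that have no real translation.
    next if path.nil? && !fallback
    path ||= default_path
    prefix = lang == default_lang ? '' : "/#{lang}"
    %(<link rel="alternate" hreflang="#{lang}" href="#{site_url}#{prefix}#{path}"/>)
  end
  tags.compact
end
```

With `fallback: false`, a page that exists only in `en` and `es` would produce two tags even if `fr` is configured; with the default `fallback: true`, `fr` would also get a tag pointing at the default-language content under `/fr/`.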
- Add three tests for hreflang_fallback behavior
- Add documentation in README explaining the feature

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Clarify hreflang behavior for fallback pages in README.
This change improves the hreflang_fallback feature by:

1. Searching site.pages in addition to collection documents when looking for translations with matching page_id
2. Falling back to permalink matching when page_id is not set

This ensures that standalone pages (not in collections) are properly recognized as translations when using hreflang_fallback: false.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
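A rough sketch of the two-step lookup described in this commit, assuming documents can be modelled as simple structs. `TranslationDoc` and `find_translations` are hypothetical names for illustration; the real plugin works with Jekyll page and document objects.

```ruby
# Hypothetical stand-in for a Jekyll page or collection document.
TranslationDoc = Struct.new(:page_id, :permalink, :lang, keyword_init: true)

# Returns every document that counts as a translation of `current`.
def find_translations(current, collection_docs, pages)
  # Search standalone pages as well as collection documents.
  candidates = collection_docs + pages
  if current.page_id
    # Preferred: match on the shared page_id.
    candidates.select { |doc| doc.page_id == current.page_id }
  else
    # Fallback: no page_id set, so match on permalink instead.
    candidates.select { |doc| doc.permalink == current.permalink }
  end
end
```

The result includes `current` itself when it is part of the searched lists, which is convenient when the caller goes on to build a per-language map.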
When a page doesn't have `lang` set in its frontmatter, assume it belongs to the default language when building the lang_to_permalink hash. This fixes hreflang generation for standalone pages that don't explicitly set their language.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
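As an illustration of the default-language assumption, here is a minimal sketch that builds such a hash from plain hashes standing in for page data. The function name mirrors the `lang_to_permalink` hash mentioned above, but the signature and data shape are assumptions made for this example.

```ruby
# Hypothetical sketch: docs are plain hashes with optional 'lang' keys.
def lang_to_permalink(docs, default_lang)
  docs.each_with_object({}) do |doc, map|
    # A page without `lang` in its frontmatter is treated as default-language.
    lang = doc['lang'] || default_lang
    map[lang] = doc['permalink']
  end
end
```

For example, a page with no `lang` plus a page with `lang: es` would yield one entry under the default language and one under `es`.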
During non-default language builds, the page context's permalink may include the language prefix (e.g., /es/about), while the stored page data has the base permalink (/about). Strip the active language prefix before matching to ensure documents are found correctly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
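The prefix stripping could look something like the sketch below. `strip_lang_prefix` is a hypothetical helper, not the method used in the plugin; it only removes the prefix when it is a whole path segment, so a permalink such as `/essay/` is left alone during an `es` build.

```ruby
# Hypothetical helper: remove a leading "/<lang>" path segment, if present.
def strip_lang_prefix(permalink, active_lang)
  # The lookahead makes sure we matched a full segment ("/es/" or "/es"),
  # not just the first letters of a longer one ("/essay/").
  stripped = permalink.sub(%r{\A/#{Regexp.escape(active_lang)}(?=/|\z)}, '')
  stripped.empty? ? '/' : stripped
end
```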
Change fallback from `permalink` to `normalized_permalink` for current_permalink, default_lang_permalink, and alt_permalink to ensure consistent behavior when matching documents fails. This fixes the x-default hreflang URL during non-default language builds.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The URL relativization regex was only excluding hreflang="default_lang" and rel="canonical" from being rewritten with language prefixes. This caused hreflang="x-default" URLs to be incorrectly modified.

Added hreflang="x-default" to the negative lookbehind pattern to ensure x-default URLs always point to the default language version.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
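To illustrate the lookbehind idea, here is a deliberately simplified sketch. It is not Polyglot's real relativization regex, which handles baseurls, excluded paths, and already-prefixed URLs; `relativize_hrefs` and its parameters are assumptions for the example.

```ruby
# Simplified, hypothetical version of the relativization step.
def relativize_hrefs(html, site_url, active_lang, default_lang)
  escaped_url = Regexp.escape(site_url)
  escaped_default = Regexp.escape(default_lang)
  # Two negative lookbehinds: leave hrefs alone when they directly follow
  # hreflang="<default_lang>" or hreflang="x-default".
  pattern = %r{(?<!hreflang="#{escaped_default}" )(?<!hreflang="x-default" )href="#{escaped_url}/}
  html.gsub(pattern) { %(href="#{site_url}/#{active_lang}/) }
end
```

In this sketch an ordinary `<a href="https://example.com/blog/">` gains the `/es/` prefix during an `es` build, while the `x-default` and default-language alternate links keep their original URLs. It does not guard against URLs that already carry a language prefix.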
Adds a new `relativize_canonical` config option (defaults to false) that controls whether canonical URLs from other plugins (like jekyll-seo-tag) are relativized with language prefixes.

When `relativize_canonical: true`:

- Canonical URLs from external plugins get the language prefix added
- Useful when using jekyll-seo-tag alongside polyglot's i18n_headers

When `relativize_canonical: false` (default):

- Canonical URLs are NOT relativized (preserves backwards compatibility)
- Canonical URLs from external plugins remain unchanged

This allows sites using jekyll-seo-tag to have their canonical URLs properly prefixed with the active language, solving duplicate canonical tag conflicts when using both plugins together.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
👋 heya @rathboma , thanks for the PR and contribution! I have a big overall suggestion for this, which is I would rather there not be an added config option to maintain backwards compatibility with
the challenge with this ruby project has been writing tests, and you've helped tremendously here adding tests to the PR. bravo sir! finding improvements like this is important. I have some other code suggestions to look into as well. Thanks again for this contribution! feel free to add your site to the readme contribution list if you want! bonus cross-promotion is a reward for contribution.

polyglot is due for a patch release with a few misc changes and more tests. I might get this done later this week with these additions, and a new blogpost to announce this update.

@untra sounds good! I didn't want to change default behavior, but I can update to do that if you like. I'm still working through a couple of bugs (notably with x-default). Just for transparency - Claude Code has been helping a lot with both tests and finding bugs.
Co-authored-by: Samuel Volin <untra.sam@gmail.com>
BREAKING CHANGE: Remove configuration options and make better behavior default

- Remove `hreflang_fallback` option - now always only generates hreflang tags for languages with actual translations (previously required setting `hreflang_fallback: false`)
- Remove `relativize_canonical` option - now always relativizes canonical URLs from external plugins like jekyll-seo-tag (previously required setting `relativize_canonical: true`)

These changes improve SEO accuracy out of the box:

- hreflang tags only advertise language versions that actually exist
- Canonical URLs correctly include language prefix on translated pages
- x-default and default language hreflang URLs are preserved as-is

Updated README documentation to reflect the new default behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This ended up being a bigger PR than I expected. I'm currently adding a new option to customize which page should be labeled as canonical. I think if
When enabled, canonical URLs on fallback pages (pages without actual translations) point to the default language URL instead of the current language URL.

This improves SEO by:

- Preventing search engines from indexing duplicate fallback content
- Consolidating SEO authority to the original content
- Signaling which version is the authoritative source

The option also excludes canonical URLs from relativization when enabled, ensuring they correctly point to the default language.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
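A minimal sketch of the canonical decision this commit describes, assuming the caller already knows which languages have a real translation of the page. `canonical_url`, `translated_langs`, and the keyword argument are hypothetical names, not the plugin's API.

```ruby
# Hypothetical helper, not Polyglot's implementation.
# translated_langs: languages for which this page has an actual translation.
def canonical_url(site_url, permalink, active_lang, default_lang, translated_langs,
                  fallback_to_default: true)
  is_fallback = !translated_langs.include?(active_lang)
  # Fallback pages point at the default-language URL when the option is on.
  use_default = active_lang == default_lang || (is_fallback && fallback_to_default)
  prefix = use_default ? '' : "/#{active_lang}"
  "#{site_url}#{prefix}#{permalink}"
end
```

So an untranslated `/about/` page built for `es` would get a canonical of `/about/` with the option enabled and `/es/about/` with it disabled, while a page that has a Spanish translation keeps its `/es/...` canonical either way.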
Instead of trying to exclude canonical URLs from relativization via regex (which was complex and error-prone), recommend using jekyll-seo-tag's new `canonical=false` option combined with Polyglot's I18n_Headers tag.

This provides cleaner separation of concerns:

- jekyll-seo-tag handles all SEO tags except canonical
- Polyglot's I18n_Headers handles canonical and hreflang tags with proper translation detection

Changes:

- Remove canonical exclusion from absolute_url_regex
- Update README to document jekyll-seo-tag integration
- Remove test for canonical regex exclusion

Related: jekyll/jekyll-seo-tag#521

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

ok, I think this is good for final review now. Please test it out. I have tested the Beekeeper Studio site build and it works as expected for me!

@untra this is done! It's working great.
{% I18n_Headers %}

The `canonical=false` option is available in jekyll-seo-tag v2.9.0+ (see [PR #521](https://github.com/jekyll/jekyll-seo-tag/pull/521)).
I see what you're doing here, and I respect it 😁
I will merge this in once jekyll-seo-tag is updated
@rathboma if you can follow up on jekyll/jekyll-seo-tag#521 and see that merged in, I will approve this PR and the other approved PRs you've made for polyglot will get merged in.
I might punch up the docs and remove the link to that PR from this README.md after the fact. But this feature work depends on the jekyll-seo-tag PR being merged first.
I really appreciate your effort to refine adjustments to both of these projects, so that {% seo canonical=false %} can be used and polyglot brings the page canonical. This is a great feature, but there's a build order to this.
Heya @rathboma give this PR and jekyll/jekyll-seo-tag#521 another pass, and we can get them merged in for the next polyglot release after this one
@untra sorry for the delay! I updated my jekyll-seo-tag PR to fix their feedback, hopefully it should be merged soon.
Resolve conflicts in README.md and site_spec.rb, keeping both the hreflang fallback docs/tests and the upstream rendered_lang tests and netlify redirects docs.
Fix #281
Overview

This PR improves Polyglot's SEO behavior by making better defaults the standard. These changes are breaking for sites that relied on the previous fallback behavior.

Breaking Changes

hreflang tags now only generated for actual translations

Previously, Polyglot generated `hreflang` tags for all configured languages, even when a page fell back to the default language content. Now, `hreflang` tags are only generated for languages that have actual translations.

Before: A page with only English content would get `hreflang` tags for all configured languages (en, es, fr, de, etc.)

After: The same page only gets `hreflang="en"` and `hreflang="x-default"`

New Feature: Fallback Canonical URLs

Added `fallback_canonical_to_default_lang` option to control canonical URL behavior for fallback pages.

When enabled:

- Pages with actual translations keep a canonical URL in their own language (e.g., `/es/sobre-nosotros/`)
- Fallback pages get a canonical URL that points to the default language (e.g., `/about/` instead of `/es/about/`)

Recommended: Use with jekyll-seo-tag

For best results with canonical URLs, we recommend using jekyll-seo-tag's `canonical=false` option combined with Polyglot's `I18n_Headers` tag:

    {% seo canonical=false %}
    {% I18n_Headers %}

This allows Polyglot's `I18n_Headers` to handle canonical URLs with proper translation detection, while jekyll-seo-tag handles all other SEO tags.

Improvements Included

Extended translation detection to include `site.pages`

- Previously only searched `site.collections` for translations with matching `page_id`
- Now also searches `site.pages`, so standalone pages (not in collections) properly detect translations

Permalink-based translation matching

- When `page_id` is not set, falls back to matching translations by permalink

Proper handling of pages without `lang` frontmatter

- Pages without `lang` in frontmatter are treated as belonging to `default_lang`

Language prefix normalization

- Strips the active language prefix from the permalink before matching (e.g., `/es/about` → `/about`)

Fixed `x-default` URL relativization

- Prevents `x-default` URLs from being incorrectly rewritten with language prefixes
- `x-default` now correctly points to the default language version

Example Results

Page with English and Spanish translations:

Fallback page (no translations) with `fallback_canonical_to_default_lang: true`:

Type of change

Checklists