We intentionally want to allow well-behaved search engines like Google/Bing/etc.
Blocking all search engines isn't needed or wanted.
Re-reading this a bit, I'm not sure it's useful to explicitly list UAs like this. Any non-compliant scraper is not going to respect the robots.txt anyway. I think we should stick to a single UA policy.
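For concreteness, a single-UA-policy robots.txt could be as small as this (a minimal sketch; the `/86` path is an assumption standing in for the page discussed in this PR):

```
# One policy for every crawler; no per-agent exceptions to maintain.
User-agent: *
Disallow: /86
```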
It's better to devote this energy to Caddyfile improvements based on declared User-Agents. We can rate limit them, preferentially shed their load early, and apply other behavior based additionally on signals like IP address range.
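As a hedged sketch of what that Caddyfile direction could look like, using Caddy's standard `header_regexp` matcher (the agent patterns and the 429 response are illustrative choices, not a tested policy; actual rate limiting would need a plugin such as mholt/caddy-ratelimit):

```
example.com {
	# Match self-declared scraper User-Agents (patterns are examples only).
	@scrapers header_regexp ua User-Agent (GPTBot|CCBot|Bytespider)

	# Shed their load early instead of serving the page.
	respond @scrapers "Too Many Requests" 429
}
```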
I don't know whether everything special should be blocked (does that block search?), and I'm not sure about the prefix syntax, nor whether most parsers respect it.
Currently the robots.txt only blocks scraping and indexing of our 86 page.
This PR blocks all user agents by default, but still allows some user agents, namely search engines and AI agents searching on behalf of users, to search the website, while continuing to limit their access to our 86 page.
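A sketch of the kind of robots.txt described above (the agent names and the `/86` path are illustrative assumptions, not the exact contents of the file):

```
# Default: block everything.
User-agent: *
Disallow: /

# Allow well-behaved search engines, but keep them off the 86 page.
User-agent: Googlebot
User-agent: Bingbot
Allow: /
Disallow: /86
```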