
Update robots.txt to disallow bots #430

Open

jetpham wants to merge 1 commit into master from jet/robots_txt

Conversation

@jetpham (Member) commented Feb 16, 2026

Currently, the robots.txt only blocks scraping and indexing of our 86 page:

User-agent: *
Disallow: /wiki/86
Disallow: /86
Disallow: /index.php?page=86
Noindex: /wiki/86
Noindex: /86
Noindex: /index.php?page=86

This PR blocks all user agents by default:

User-agent: *
Disallow: /

But it still allows some user agents, namely search engines and AI tools, to search the website:


User-agent: Googlebot
Allow: /
Disallow: /wiki/Special:
Disallow: /wiki/86
Disallow: /index.php?
Disallow: /api.php

User-agent: Bingbot
Allow: /
Disallow: /wiki/Special:
Disallow: /wiki/86
Disallow: /index.php?
Disallow: /api.php

...

But these agents are still blocked from our 86 page.

@jetpham requested review from ElanHR, mcint and nthmost February 16, 2026 23:02
@jetpham self-assigned this Feb 16, 2026
@SuperQ (Collaborator) left a comment


We intentionally want to allow well-behaved search engines like Google/Bing/etc.

Blocking all search engines isn't needed or wanted.

Re-reading this a bit, I'm not sure it's useful to explicitly list UAs like this. Any non-compliant scraper is not going to respect the robots.txt anyway. I think we should stick to a single UA policy.
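
For illustration, a single-UA policy might collapse the per-bot sections into one wildcard block; this sketch simply reuses the disallow rules already proposed in this PR:

# Sketch: one wildcard policy reusing the PR's disallow rules
User-agent: *
Disallow: /wiki/Special:
Disallow: /wiki/86
Disallow: /index.php?
Disallow: /api.php

Compliant crawlers would then get identical rules regardless of their declared name, and non-compliant ones ignore the file either way.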

@mcint (Contributor) commented Feb 17, 2026

  1. It's better to devote this energy to Caddyfile improvements based on declared User-Agents. We can rate limit them, preferentially shed their load early, and apply other behavior keyed additionally to IP address range, among other signals (see the sketch after this list).

  2. Check out https://wikipedia.org/robots.txt. Confirm the same for mediawiki.org and commons.wikimedia.org too.

I don't know if all Special: pages should be blocked (does that block search?), and I'm not sure about the prefix syntax, nor whether most parsers respect it.
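
For illustration, a minimal Caddyfile sketch of matching on declared User-Agents; the site address, backend port, and bot names here are assumptions, and actual rate limiting or load shedding would need a third-party Caddy module, so this sketch just rejects matched agents outright:

# A minimal sketch, not the project's actual Caddyfile.
# The site address, backend port, and bot names are assumptions.
example.org {
	# Named matcher: User-Agent header matching a few example crawlers
	@scrapers header_regexp ua User-Agent (?i)(GPTBot|CCBot|Bytespider)

	# Reject matched agents early; rate limiting instead of a hard 403
	# would require a third-party module such as a rate-limit plugin.
	respond @scrapers "Crawling not permitted" 403

	reverse_proxy localhost:8080
}

Unlike robots.txt, this is enforced server-side, so it also applies to agents that ignore crawler conventions.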

