mirror of https://git.sr.ht/~seirdy/seirdy.one synced 2024-11-10 00:12:09 +00:00

add some AI scrapers to robots.txt

Rohan Kumar 2024-03-12 23:53:58 -04:00
parent b1cc2f135d
commit 1cd7f2c106
No known key found for this signature in database
GPG key ID: 1E892DB2A5F84479


@@ -53,4 +53,14 @@ Disallow: /
User-agent: Google-Extended
Disallow: /
# There isn't any public documentation for this AFAICT, but Reuters thinks this works so I might as well give it a shot.
User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
# I'm not blocking CCBot for now, since it's also used for upstart/hobbyist search engines like Alexandria and for genuinely useful academic work I personally like. I'm hoping my embedded robots meta-tags and headers will cover gen-AI opt-outs instead.
# Omgilibot/Omgili is similar to CCBot, except it sells the scrape results. I'm not familiar enough to make a call here.
Sitemap: https://seirdy.one/sitemap.xml
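
One way to sanity-check rules like these is to feed them to a robots.txt parser and ask which user agents end up blocked. The sketch below uses Python's standard-library `urllib.robotparser`. It is not part of this commit: the helper name `blocked_agents` is hypothetical, and `ROBOTS_TXT` holds only the rules visible in this hunk, not the full file. It also only shows how this particular parser reads the rules; whether the named crawlers honor them is a separate question.

```python
from urllib.robotparser import RobotFileParser

# Assumption: only the rules visible in this hunk, not the complete robots.txt.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /

Sitemap: https://seirdy.one/sitemap.xml
"""


def blocked_agents(robots_txt, agents, url="https://seirdy.one/"):
    """Map each user agent to True if the rules forbid it from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    agents = ["Google-Extended", "anthropic-ai", "Claude-Web", "CCBot"]
    for agent, blocked in blocked_agents(ROBOTS_TXT, agents).items():
        print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

With this fragment, CCBot comes out as allowed because no group names it and there is no `User-agent: *` fallback in the excerpt, which matches the intent stated in the comments above.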