mirror of https://git.sr.ht/~seirdy/seirdy.one synced 2024-12-25 02:02:11 +00:00

Opt out of GenAI training on OpenWebSearch.EU

Seirdy 2024-12-09 23:14:06 -05:00
parent b9536a6a9d
commit c3b3f9a269
No known key found for this signature in database
GPG key ID: 1E892DB2A5F84479
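
The rule this commit adds can be sanity-checked offline. The following is a minimal sketch using Python's standard `urllib.robotparser`, not the author's tooling: the `RULES` excerpt, the `build_parser` helper, and the `example.org` URL are assumptions made for illustration, and Owler's own matching logic may differ from this parser's.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt modelled on the two directives added in this commit.
# The real robots.txt contains many more entries.
RULES = """\
# I allow Owler but disallow its "GenAI" identifier
User-Agent: GenAI
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(RULES)

# The "GenAI" identifier matches the new entry, so every path is disallowed.
genai_allowed = parser.can_fetch("GenAI", "https://example.org/")

# No entry in this excerpt names "Owler", so the parser falls back to allowing it.
owler_allowed = parser.can_fetch("Owler", "https://example.org/")

print(genai_allowed, owler_allowed)
```

This only models generic robots.txt matching in this reduced excerpt; whether a crawler honours the `GenAI` identifier is up to the crawler itself.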


@@ -141,9 +141,12 @@ Disallow: /
# Google used this to train the initial version of Bard (now called Gemini).
# I allow CCBot since its index is also used for upstart/hobbyist search engines
# like Alexandria and for genuinely useful academic work I personally like.
-# I allow Owler for similar reasons:
+# I allow Owler but disallow its "GenAI" identifier for similar reasons:
+# <https://openwebsearch.eu/owler/#owler-opt-out>
# <https://openwebsearch.eu/common-goals-with-common-crawl/>.
+User-Agent: GenAI
+Disallow: /
# Omgilibot/Omgili is similar to CCBot, except it sells the scrape results.
# I'm not familiar enough with Omgili to make a call here.
# In the long run, my embedded robots meta-tags and headers could cover gen-AI