mirror of https://git.sr.ht/~seirdy/seirdy.one (synced 2024-11-23 12:52:10 +00:00)
update facebook GenAI crawler in robots.txt
parent bf75e3a3d0
commit ce1b74f4e2
1 changed file with 3 additions and 1 deletion
@@ -110,8 +110,10 @@ Disallow: /
 
 # FacebookBot crawls public web pages to improve language models for our speech
 # recognition technology.
+# UPDATE 2024-07: The Meta-ExternalAgent crawler crawls the web for use cases such as training AI models or improving products by indexing content directly.
 # <https://developers.facebook.com/docs/sharing/bot/?_fb_noscript=1>
 User-Agent: FacebookBot
+User-Agent: meta-externalagent
 Disallow: /
 
 # I'm not blocking CCBot for now. It publishes a free index for anyone to use.
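
The change relies on robots.txt grouping: consecutive User-Agent lines share the rules that follow them, so the single "Disallow: /" should apply to both FacebookBot and meta-externalagent. A minimal sketch of checking that with Python's standard urllib.robotparser follows. The helper name is_allowed and the example.org URL are hypothetical, and the result only shows how this one parser reads the group; the crawlers themselves may behave differently.

```python
from urllib.robotparser import RobotFileParser

# The rule group from the diff above, reduced to its directives.
ROBOTS_TXT = """\
User-Agent: FacebookBot
User-Agent: meta-externalagent
Disallow: /
"""


def is_allowed(agent: str, url: str = "https://example.org/") -> bool:
    """Return True if `agent` may fetch `url` under ROBOTS_TXT.

    Hypothetical helper; the default URL is a placeholder.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("FacebookBot", "meta-externalagent", "CCBot"):
        print(agent, "allowed" if is_allowed(agent) else "blocked")
```

CCBot is included only as a contrast case: no group in this fragment names it, so this parser treats it as allowed.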