| Author | Commit | Message | Date |
|---|---|---|---|
| Seirdy | 0139e58f87 | new page: scrapers i block. Also remove the comments in robots.txt in favor of directing people to that page. | 2024-12-10 00:37:54 -05:00 |
| Seirdy | c3b3f9a269 | Opt out of GenAI training on OpenWebSearch.EU | 2024-12-09 23:14:06 -05:00 |
| Seirdy | 1581a1077d | block Kangaroo Bot | 2024-12-06 23:22:48 -05:00 |
| Seirdy | aea602e88a | Block webz GenAI scraping | 2024-10-23 00:09:46 -04:00 |
| Seirdy | 100a6f3d11 | block another LLM scraper | 2024-09-26 10:47:07 -04:00 |
| Seirdy | 1701c4b254 | Slow down MJ12bot | 2024-08-08 02:21:00 -04:00 |
| Seirdy | 95293d5edb | Add usvg to uses page | 2024-08-01 23:31:07 -04:00 |
| Seirdy | ce1b74f4e2 | update facebook GenAI crawler in robots.txt | 2024-07-26 23:03:30 -04:00 |
| Seirdy | ee8a98721b | Disallow LLM training by Apple | 2024-06-25 00:54:04 -04:00 |
| Seirdy | 4f28f001bf | robots.txt: remove unused anthropic directives. official docs show the right opt-out signal | 2024-06-01 05:35:15 -04:00 |
| Rohan Kumar | e4e020649d | Robots.txt: Add new ClaudeBot UA, formatting | 2024-05-06 17:44:22 -04:00 |
| Rohan Kumar | 247ec11dae | Add some more docs to robots.txt | 2024-03-20 21:34:55 -04:00 |
| Rohan Kumar | 619c4ec3f6 | minor robots.txt refactor + block facebookbot | 2024-03-13 02:23:28 -04:00 |
| Rohan Kumar | 0e89f7f052 | Update docs in robots.txt | 2024-03-13 01:14:49 -04:00 |
| Rohan Kumar | 1cd7f2c106 | add some AI scrapers to robots.txt | 2024-03-12 23:53:58 -04:00 |
| Rohan Kumar | 4a11ca9f39 | opt out of gen-ai training | 2024-03-12 20:29:15 -04:00 |
| Rohan Kumar | 034c6301fc | Update robots.txt with OpenAI's new bot | 2023-08-06 16:54:29 -07:00 |
| Rohan Kumar | c02e2a78a5 | Add another IP-violation crawler to robots.txt | 2023-07-24 23:43:13 -07:00 |
| Rohan Kumar | 287a0a5dc0 | More robots.txt exclusions. For shitty services that at least respect robots.txt | 2023-07-24 15:33:02 -07:00 |
| Rohan Kumar | b858a31f40 | Fuck off, OpenAI | 2023-04-07 18:05:37 -07:00 |
| Rohan Kumar | c648324fe6 | remove redundant robots.txt entry | 2022-07-24 11:32:16 -07:00 |
| Rohan Kumar | 5c5f134c95 | slightly re-org robots.txt | 2022-07-13 18:12:47 -07:00 |
| Rohan Kumar | 95a1685567 | Kang VLC's robots.txt commentary | 2022-06-12 21:52:28 -07:00 |
| Rohan Kumar | 3300d14f99 | robots: disallow some toxic bs | 2022-04-22 21:45:15 -07:00 |
| Rohan Kumar | e03a35da82 | Update robots.txt | 2021-06-11 15:09:43 -07:00 |
| Rohan Kumar | aa5c16bdf3 | Update robots.txt | 2021-01-23 12:47:50 -08:00 |
| rohan kumar | 638cb80ed3 | Fix robots.txt | 2020-12-15 23:14:09 -08:00 |
| rohan kumar | 243238be28 | Fix robots.txt | 2020-11-30 13:06:44 -08:00 |
| rohan kumar | 349ba15f38 | Add robots.txt | 2020-11-29 11:37:33 -08:00 |