mirror of https://git.sr.ht/~seirdy/seirdy.one synced 2024-12-26 18:32:10 +00:00

29 commits

Author SHA1 Message Date
Seirdy
0139e58f87
new page: scrapers i block
Also remove the comments in robots.txt in favor of directing people to
that page.
2024-12-10 00:37:54 -05:00
Seirdy
c3b3f9a269
Opt out of GenAI training on OpenWebSearch.EU 2024-12-09 23:14:06 -05:00
Seirdy
1581a1077d
block Kangaroo Bot 2024-12-06 23:22:48 -05:00
Seirdy
aea602e88a
Block webz GenAI scraping 2024-10-23 00:09:46 -04:00
Seirdy
100a6f3d11
block another LLM scraper 2024-09-26 10:47:07 -04:00
Seirdy
1701c4b254
Slow down MJ12bot 2024-08-08 02:21:00 -04:00
Seirdy
95293d5edb
Add usvg to uses page 2024-08-01 23:31:07 -04:00
Seirdy
ce1b74f4e2
update facebook GenAI crawler in robots.txt 2024-07-26 23:03:30 -04:00
Seirdy
ee8a98721b
Disallow LLM training by Apple 2024-06-25 00:54:04 -04:00
Seirdy
4f28f001bf
robots.txt: remove unused anthropic directives
official docs show the right opt-out signal
2024-06-01 05:35:15 -04:00
Rohan Kumar
e4e020649d
Robots.txt: Add new ClaudeBot UA, formatting 2024-05-06 17:44:22 -04:00
Rohan Kumar
247ec11dae
Add some more docs to robots.txt 2024-03-20 21:34:55 -04:00
Rohan Kumar
619c4ec3f6
minor robots.txt refactor + block facebookbot 2024-03-13 02:23:28 -04:00
Rohan Kumar
0e89f7f052
Update docs in robots.txt 2024-03-13 01:14:49 -04:00
Rohan Kumar
1cd7f2c106
add some AI scrapers to robots.txt 2024-03-12 23:53:58 -04:00
Rohan Kumar
4a11ca9f39
opt out of gen-ai training 2024-03-12 20:29:15 -04:00
Rohan Kumar
034c6301fc
Update robots.txt with OpenAI's new bot 2023-08-06 16:54:29 -07:00
Rohan Kumar
c02e2a78a5
Add another IP-violation crawler to robots.txt 2023-07-24 23:43:13 -07:00
Rohan Kumar
287a0a5dc0
More robots.txt exclusions
For shitty services that at least respect robots.txt
2023-07-24 15:33:02 -07:00
Rohan Kumar
b858a31f40
Fuck off, OpenAI 2023-04-07 18:05:37 -07:00
Rohan Kumar
c648324fe6
remove redundant robots.txt entry 2022-07-24 11:32:16 -07:00
Rohan Kumar
5c5f134c95
slightly re-org robots.txt 2022-07-13 18:12:47 -07:00
Rohan Kumar
95a1685567
Kang VLC's robots.txt commentary 2022-06-12 21:52:28 -07:00
Rohan Kumar
3300d14f99
robots: disallow some toxic bs 2022-04-22 21:45:15 -07:00
Rohan Kumar
e03a35da82
Update robots.txt 2021-06-11 15:09:43 -07:00
Rohan Kumar
aa5c16bdf3
Update robots.txt 2021-01-23 12:47:50 -08:00
rohan kumar
638cb80ed3
Fix robots.txt 2020-12-15 23:14:09 -08:00
rohan kumar
243238be28
Fix robots.txt 2020-11-30 13:06:44 -08:00
rohan kumar
349ba15f38
Add robots.txt 2020-11-29 11:37:33 -08:00