mirror of
https://git.sr.ht/~seirdy/seirdy.one
synced 2024-11-10 00:12:09 +00:00
287a0a5dc0
For shitty services that at least respect robots.txt
46 lines
2.1 KiB
Text
User-agent: *
Disallow: /noindex/
Disallow: /misc/

# I opt out of online advertising so malware that injects ads on my site won't get paid.
# You should do the same. My ads.txt file contains a standard placeholder to forbid any
# compliant ad networks from paying for ad placement on my domain.
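#
# For reference, a sketch of what such a placeholder can look like, assuming the
# record suggested by the IAB ads.txt specification (the actual ads.txt file is
# not shown here, so treat this as illustrative only):
#   placeholder.example.com, placeholder, DIRECT, placeholder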
User-Agent: Adsbot
Disallow: /
Allow: /ads.txt
Allow: /app-ads.txt

# The next three are borrowed from https://www.videolan.org/robots.txt

# > This robot collects content from the Internet for the sole purpose of helping educational institutions prevent plagiarism. [...] we compare student papers against the content we find on the Internet to see if we can find similarities. (http://www.turnitin.com/robot/crawlerinfo.html)
# --> fuck off.
User-Agent: TurnitinBot
Disallow: /

# > NameProtect engages in crawling activity in search of a wide range of brand and other intellectual property violations that may be of interest to our clients. (http://www.nameprotect.com/botinfo.html)
# --> fuck off.
User-Agent: NPBot
Disallow: /

# iThenticate is a new service we have developed to combat the piracy of intellectual property and ensure the originality of written work for publishers, non-profit agencies, corporations, and newspapers. (http://www.slysearch.com/)
# --> fuck off.
User-Agent: SlySearch
Disallow: /

# BLEXBot assists internet marketers to get information on the link structure of sites and their interlinking on the web, to avoid any technical and possible legal issues and improve overall online experience. (http://webmeup-crawler.com/)
# --> fuck off.
User-Agent: BLEXBot
Disallow: /

# Providing Intellectual Property professionals with superior brand protection services by artfully merging the latest technology with expert analysis. (https://www.checkmarknetwork.com/spider.html/)
# "The Internet is just way to big to effectively police alone." (ACTUAL quote)
# --> fuck off.
User-agent: CheckMarkNetwork
Disallow: /

# Eat shit, OpenAI.
User-agent: ChatGPT-User
Disallow: /


Sitemap: https://seirdy.one/sitemap.xml