From 5c5f134c9582641d4196d43ef938855faa39be2c Mon Sep 17 00:00:00 2001
From: Rohan Kumar
Date: Wed, 13 Jul 2022 18:12:47 -0700
Subject: [PATCH] slightly re-org robots.txt

---
 static/robots.txt | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/static/robots.txt b/static/robots.txt
index 3e343e0..18f53c6 100644
--- a/static/robots.txt
+++ b/static/robots.txt
@@ -1,34 +1,36 @@
 User-agent: *
 Disallow: /noindex/
 Disallow: /misc/
-Disallow: /webmentions/

-# "This robot collects content from the Internet for the sole purpose of # helping educational institutions prevent plagiarism. [...] we compare # student papers against the content we find on the Internet to see if we # can find similarities." (http://www.turnitin.com/robot/crawlerinfo.html)
+# I opt out of online advertising so malware that injects ads on my site won't get paid.
+# You should do the same.
+User-Agent: Adsbot
+Disallow: /
+Allow: /ads.txt
+
+# the UA should not be case-sensitive, but I gotta cover my bases.
+User-Agent: AdsBot
+Disallow: /
+Allow: /ads.txt
+
+# > This robot collects content from the Internet for the sole purpose of # helping educational institutions prevent plagiarism. [...] we compare student papers against the content we find on the Internet to see if we # can find similarities. (http://www.turnitin.com/robot/crawlerinfo.html)
 # --> fuck off.
 User-Agent: TurnitinBot
 Disallow: /

-# "NameProtect engages in crawling activity in search of a wide range of
-# brand and other intellectual property violations that may be of interest
-# to our clients." (http://www.nameprotect.com/botinfo.html)
+# > NameProtect engages in crawling activity in search of a wide range of brand and other intellectual property violations that may be of interest to our clients. (http://www.nameprotect.com/botinfo.html)
 # --> fuck off.
 User-Agent: NPBot
 Disallow: /

-# "iThenticate is a new service we have developed to combat the piracy of intellectual property and ensure the originality of written work for# publishers, non-profit agencies, corporations, and newspapers." (http://www.slysearch.com/)
+# iThenticate is a new service we have developed to combat the piracy of intellectual property and ensure the originality of written work for# publishers, non-profit agencies, corporations, and newspapers. (http://www.slysearch.com/)
 # --> fuck off.
 User-Agent: SlySearch
 Disallow: /

-# "BLEXBot assists internet marketers to get information on the link structure of sites and their interlinking on the web, to avoid any technical and possible legal issues and improve overall online experience." (http://webmeup-crawler.com/)
+# BLEXBot assists internet marketers to get information on the link structure of sites and their interlinking on the web, to avoid any technical and possible legal issues and improve overall online experience. (http://webmeup-crawler.com/)
 # --> fuck off.
 User-Agent: BLEXBot
-Dissalow: /
-
-User-Agent: Adsbot
-Disallow: /
-
-User-Agent: AdsBot
 Disallow: /

 Sitemap: https://seirdy.one/sitemap.xml
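
One way to sanity-check the post-patch rules is to feed an excerpt to Python's standard-library `urllib.robotparser`. This is only a sketch of how one particular parser reads the file, not a statement about how any real crawler behaves. The user-agent strings `ADSBOT-Example` and `SomeOtherBot` and the paths `/about/` and `/misc/page` are made up for illustration; the rules themselves are copied from the diff.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the post-patch rules, copied from the diff.
ROBOTS_TXT = """\
User-agent: *
Disallow: /noindex/
Disallow: /misc/

User-Agent: Adsbot
Disallow: /
Allow: /ads.txt

User-Agent: BLEXBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical user-agent strings and paths, chosen only for illustration.
checks = [
    ("ADSBOT-Example", "/"),      # this parser lowercases both sides before matching
    ("BLEXBot", "/about/"),       # blocked now that the group has a valid Disallow
    ("SomeOtherBot", "/about/"),  # falls through to the "*" group
    ("SomeOtherBot", "/misc/page"),
]
for agent, path in checks:
    print(f"{agent:15} {path:12} allowed={parser.can_fetch(agent, path)}")
```

With this parser, the single `Adsbot` group already matches an upper-case user agent, which is consistent with the comment that the UA should not be case-sensitive. `urllib.robotparser` applies the rules of a group in file order, so it is not a good model of the `Allow: /ads.txt` exception; parsers that follow the longest-match rule of RFC 9309 treat that line differently, which is why it is left out of the checks.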