mirror of
https://git.sr.ht/~seirdy/seirdy.one
synced 2024-11-23 21:02:09 +00:00
Compare commits
No commits in common. "b061e1b63fed746588ef00acd6d8cfb31c2ca0be" and "ae3e26e5928a9f0f48eabda959d4012eead0f81a" have entirely different histories.
b061e1b63f
...
ae3e26e592
3 changed files with 5 additions and 11 deletions
@@ -60,8 +60,7 @@ These are large engines that pass all my standard tests and more.
3. Yandex: originally a Russian search engine, it now has an English version. Some Russian results bleed into its English site. Like Bing, it allows submitting pages and sitemaps for crawling using the IndexNow API. Powers:
* Epic Search (went paid-only by June 2021)
* Occasionally powers DuckDuckGo’s link results instead of Bing. (update: DuckDuckGo has "paused" its partnership with Yandex)
* Petal for Russian users only.
* Occasionally powers DuckDuckGo’s link results instead of Bing.
4. Mojeek: Seems privacy-oriented with a large index containing billions of pages. Quality isn’t at Google/Bing/Yandex’s level, but it’s not bad either. If I had to use Mojeek as my default general search engine, I’d live. Partially powers eTools.ch. At this moment, I think that Mojeek is the best alternative to GBY for general web search.
@@ -223,14 +222,12 @@ These engines try to find a website, typically at the domain-name level. They do
* Ninfex: a "people-powered" search engine that combines aspects of link aggregators and search. It lets users vote on submissions and it also displays links to forums about submissions.
* Semantic Scholar: a search engine by the Allen Institute for AI focused on academic PDFs, with a couple hundred million papers indexed. Discovered in my access logs.
* Bonzamate: a search engine specifically for Australian websites.
* searchcode: A code-search engine by the developer of Bonzamate. Searches a hand-picked list of code forges for source code, supporting many search operators.
=> https://www.keybot.com/ Keybot Translation Search Machine.
=> https://ninfex.com Ninfex
=> https://www.semanticscholar.org/ Semantic Scholar
=> https://bonzamate.com.au/ Bonzamate
=> https://boyter.org/posts/abusing-aws-to-make-a-search-engine/ Blog post about Bonzamate: "Abusing AWS to make a search engine".
=> https://searchcode.com/ searchcode
## Other languages
@@ -438,7 +435,7 @@ He also gave me some useful details about Seznam, Naver, Baidu, and Goo:
² Matt from Gigablast told me that indexing YouTube or LinkedIn will get you blocked if you aren't Google or Microsoft. I imagine that you could do so by getting special permission if you're a megacorporation.
³ DuckDuckGo has a crawler called DuckDuckBot. This crawler doesn't impact the linked results displayed; it just grabs favicons and scrapes data for a few instant answers. DuckDuckGo's help pages claim that the engine uses over 400 sources; my interpretation is that at least 398 sources don't impact organic results. I don't think DuckDuckGo is transparent enough about the fact that their organic results are proxied. Compare DuckDuckGo side-by-side with Bing and you'll see it's sourcing organic results from one of them (probably Bing). Update 2022: DuckDuckGo has the ability to downrank results on its own; it was previously working with Bing to get Bing to remove misinformation and spam:
³ DuckDuckGo has a crawler called DuckDuckBot. This crawler doesn't impact the linked results displayed; it just grabs favicons and scrapes data for a few instant answers. DuckDuckGo's help pages claim that the engine uses over 400 sources; my interpretation is that at least 398 sources don't impact organic results. I don't think DuckDuckGo is transparent enough about the fact that their organic results are proxied. Compare DuckDuckGo side-by-side with Bing and Yandex and you'll see it's sourcing organic results from one of them (probably Bing). Update 2022: DuckDuckGo has the ability to downrank results on its own; it was previously working with Bing to get Bing to remove misinformation and spam:
=> https://web.archive.org/web/20220310222014/https://nitter.pussthecat.org/yegg/status/1501716484761997318 Gabriel Weinberg on Twitter
=> https://www.nytimes.com/2022/02/23/technology/duckduckgo-conspiracy-theories.html DuckDuckGo's prior approach to moderation

@@ -89,8 +89,7 @@ These are large engines that pass all my standard tests and more.
- Yandex: originally a Russian search engine, it now has an English version. Some Russian results bleed into its English site. Like Bing, it allows submitting pages and sitemaps for crawling using the IndexNow API. Powers:
- Epic Search (went paid-only as of June 2021)
- Occasionally powers DuckDuck­Go's link results instead of Bing <ins cite="https://energycommerce.house.gov/committee-activity/hearings/hearing-on-holding-big-tech-accountable-legislation-to-protect-online">(update: DuckDuckGo has "paused" its partnership with Yandex, confirmed in {{<mention-work itemtype="Event" itemprop="mentions" role="doc-credit">}}{{<cited-work name="Hearing on “Holding Big Tech Accountable: Legislation to Protect Online Users”" url="https://energycommerce.house.gov/committee-activity/hearings/hearing-on-holding-big-tech-accountable-legislation-to-protect-online" >}}{{</mention-work>}})</ins>
- Petal, for Russian users only.
- Occasionally powers DuckDuck­Go's link results instead of Bing.
- [Mojeek](https://www.mojeek.com/): Seems privacy-oriented with a large index containing billions of pages. Quality isn't at GBY's level, but it’s not bad either. If I had to use Mojeek as my default general search engine, I'd live. Partially powers [eTools.ch](https://www.etools.ch/). At this moment, _I think that Mojeek is the best alternative to GBY_ for general search.
@@ -213,8 +212,6 @@ These engines try to find a website, typically at the domain-name level. They do
- [Bonzamate](https://bonzamate.com.au/): a search engine specifically for Australian websites. Boyter wrote [an interesting blog post about Bonzamate](https://boyter.org/posts/abusing-aws-to-make-a-search-engine/).
- [searchcode](https://searchcode.com/): A code-search engine by the developer of Bonzamate. Searches a hand-picked list of code forges for source code, supporting many search operators.
Other languages
---------------
@@ -379,7 +376,7 @@ When building webpages, authors need to consider the barriers to entry for a new
Try a "bad" engine from lower in the list. It might show you utter crap. But every garbage heap has an undiscovered treasure. I'm sure that some hidden gems you'll find will be worth your while. Let's add some serendipity to the SEO-filled Web.
Acknow­ledgements {#acknowledgements}
---------------------
-------------------------------
Some of this content came from the [Search Engine Map](https://www.searchenginemap.com/) and [Search Engine Party](https://searchengine.party/). A few web directories also proved useful.

@@ -1,7 +1,7 @@
DirectoryPath: "public"
IgnoreDirs:
- "search"
CacheExpires: "96h" # four days
CacheExpires: "72h" # three days
CheckFavicon: true
EnforceHTML5: true
IgnoreAltMissing: true # an empty alt makes presentation-role explicit, it's not a defect.