mirror of
https://git.sr.ht/~seirdy/seirdy.one
synced 2024-11-10 00:12:09 +00:00
Compare commits
2 commits
a587e5010c...40585964fe
Author | SHA1 | Date | |
---|---|---|---|
| 40585964fe | | |
| ca8736504a | | |
3 changed files with 9 additions and 7 deletions
@@ -56,6 +56,7 @@ These are large engines that pass all my standard tests and more.
 * Givero
 * Swisscows
 * Fireball
+* Netzzappen
 * You.com¹¹
 * Partially powers MetaGer by default; this can be turned off
 * At this point, I mostly stopped adding Bing-based search engines. There are just too many.
@@ -78,7 +79,7 @@ Google, Bing, and Yandex support structured data such as microformats1, microdat
 
 These engines pass most of the tests listed in the "methodology" section. All of them seem relatively privacy-friendly.
 
-* Right Dao: very fast, good results. Passes the tests fairly well. It plans on including query-based ads if/when its userbase grows.⁸
+* Right Dao: very fast, good results. Passes the tests fairly well. It plans on including query-based ads if/when its userbase grows.⁸ For the past few months, its index seems to have focused more on large, established sites rather than smaller, independent ones.
 
 => https://rightdao.com Right Dao
 
@@ -124,11 +125,9 @@ These engines fail badly at a few important tests. Otherwise, they seem to work
 => https://siik.co/ Siik
 => https://inetdex.com inetdex.com
 
-* Meorca: A UK-based search engine that claims not to "index pornography or illegal content websites". It also features an optional social network ("blog"). Discovered in the seirdy.one access logs.
 * ChatNoir: An experimental engine by researchers that uses the Common Crawl index. The engine is open source. There's more information in its announcement on the Common Crawl mailing list (Google Groups).
 * Secret Search Engine Labs: Very small index with very little SEO spam; it toes the line between a "search engine" and a "surf engine". It's best for reading about broad topics that would otherwise be dominated by SEO spam, thanks to its CashRank algorithm. Allows site submission.
 
-=> https://meorca.com/ Meorca Search Engine
 => https://www.chatnoir.eu/ ChatNoir
 => https://commoncrawl.org/ Common Crawl
 => https://github.com/chatnoir-eu ChatNoir source code (GitHub)
@@ -329,10 +328,12 @@ These engines were originally included in the article, but have since been disco
 * gus.guru: the original Gemini search engine. The index doesn't seem to be updated anymore.
 * wbsrch: In addition to its generalist search, it also had many other utilities related to domain name statistics. Failed multiple tests. Its index was a bit dated; it had an old backlog of sites it hadn’t finished indexing. It also had several dedicated per-language indexes.
 * Gowiki: Very young, small index, but showed promise. I discovered this in the seirdy.one access logs. It was only available in the US. Seems down as of early 2022.
+* Meorca: A UK-based search engine that claims not to "index pornography or illegal content websites". It also features an optional social network ("blog"). Discovered in the seirdy.one access logs.
 
 => gemini://gus.guru/ gus.guru
 => https://xangis.com/the-wbsrch-experiment/ The Wbsrch Experiment
 => https://gowiki.com Gowiki
+=> https://web.archive.org/web/20220429143153/https://www.meorca.com/search/ Meorca Search Engine (Wayback Machine snapshot)
 
 ## Exclusions
 
@@ -86,6 +86,7 @@ These are large engines that pass all my standard tests and more.
 - Givero
 - Swisscows
 - Fireball
+- Netzzappen
 - You.com[^6]
 - Partially powers MetaGer by default; this can be turned off
 - At this point, I mostly stopped adding Bing-<wbr />based search engines. There are just too many.
@@ -136,8 +137,6 @@ These engines fail badly at a few important tests. Otherwise, they seem to work
 
 - [websearchengine.org](https://websearchengine.org) and [tuxdex.com](https://tuxdex.com): Both are run by the same people, powered by their [inetdex.com](https://inetdex.com) index. Searches are fast, but crawls are a bit shallow. Claims to have an index of 10 million domains, and not to use cookies.
 
-- [Meorca](https://meorca.com/): A UK-based search engine that claims not to "index pornography or illegal content websites". It also features an optional social network ("blog"). Discovered in the seirdy.one access logs.
-
 - [ChatNoir](https://www.chatnoir.eu/): An experimental engine by researchers that uses the [Common Crawl](https://commoncrawl.org/) index. The engine is [open source](https://github.com/chatnoir-eu). See the [announcement](https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ) on the Common Crawl mailing list (Google Groups).
 
 - [Secret Search Engine Labs](http://www.secretsearchenginelabs.com/): Very small index with very little SEO spam; it toes the line between a "search engine" and a "surf engine". It's best for reading about broad topics that would otherwise be dominated by SEO spam, thanks to its [CashRank algorithm](http://www.secretsearchenginelabs.com/tech/cashrank.php). Allows site submission.
@@ -298,7 +297,9 @@ These engines were originally included in the article, but have since been disco
 
 - [wbsrch](https://wbsrch.com/): In addition to its generalist search, it also had many other utilities related to domain name statistics. Failed multiple tests. Its index was a bit dated; it had an old backlog of sites it hadn't finished indexing. It also had several dedicated per-language indexes.
 
-- [Gowiki](https://gowiki.com): Very young, small index, but showed promise. I discovered this in the seirdy.one access logs. It was only available in the US. Seems down as of early 2022.
+- [Gowiki](https://web.archive.org/web/20211226043304/https://www.gowiki.com/): Very young, small index, but showed promise. I discovered this in the seirdy.one access logs. It was only available in the US. Seems down as of early 2022.
+
+- [Meorca](https://web.archive.org/web/20220429143153/https://www.meorca.com/search/): A UK-based search engine that claims not to "index pornography or illegal content websites". It also features an optional social network ("blog"). Discovered in the seirdy.one access logs.
 
 Exclusions
 ----------
@@ -1,4 +1,4 @@
-{{- $wbmLinks := (slice "https://si3t.ch/log/2021-04-18-entetes-floc.html" "https://xmpp.org/2021/02/newsletter-02-feburary/" "https://gurlic.com/technology/post/393626430212145157" "https://gurlic.com/technology/post/343249858599059461" "https://www.librepunk.club/@penryn/108411423190214816" "https://benign.town/@josias/108457015755310198") -}}
+{{- $wbmLinks := (slice "https://si3t.ch/log/2021-04-18-entetes-floc.html" "https://xmpp.org/2021/02/newsletter-02-feburary/" "https://gurlic.com/technology/post/393626430212145157" "https://gurlic.com/technology/post/343249858599059461" "https://www.librepunk.club/@penryn/108411423190214816" "https://benign.town/@josias/108457015755310198" "http://www.tuxmachines.org/node/148146") -}}
 <hr />
 <section aria-labelledby="webmentions">
 <h2 id="webmentions" tabindex="-1">Web­mentions</h2>