mirror of https://git.sr.ht/~seirdy/seirdy.one synced 2024-11-14 01:32:11 +00:00

Update search engines

- Add ChatNoir, Ninfex
- Mark Runnaroo as discontinued
Rohan Kumar 2021-05-29 15:57:12 -07:00
parent 0b72929cfd
commit 02b2345e83
No known key found for this signature in database
GPG key ID: 1E892DB2A5F84479
2 changed files with 14 additions and 3 deletions

@@ -39,7 +39,7 @@ These are large engines that pass all the above tests and more.
  1. Google: the biggest index. Allows submitting pages and sitemaps for crawling, but requires login. Powers a few other engines:
  * Startpage
- * Runnaroo
+ * (discontinued) Runnaroo
  * SAPO (Portuguese interface, can work with English results)
  2. Bing: the runner-up. Allows submitting pages and sitemaps for crawling, but requires login. Its index powers many other engines:
@@ -119,6 +119,13 @@ These engines fail badly at a few important tests. Otherwise, they seem to work
  => https://kozmonavt.ml/ Kozmonavt
  => https://burf.co/ Burf.co
+ * ChatNoir: An experimental engine by researchers that uses the Common Crawl index. The engine is open source. There's more information in its announcement on the Common Crawl mailing list (Google Groups).
+ => https://www.chatnoir.eu/ ChatNoir
+ => https://commoncrawl.org/ Common Crawl
+ => https://github.com/chatnoir-eu ChatNoir source code (GitHub)
+ => https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ ChatNoir Announcement
  ### Unusable engines, irrelevant results
  Results from these search engines don’t seem at all useful.
@@ -145,11 +152,13 @@ These indexing search engines don’t have a Google-like “ask me anything” e
  * Wiby: I love this one. It focuses on smaller independent sites that capture the spirit of the “early” web. It’s more focused on “discovering” new interesting pages that match a set of keywords than finding a specific resources. I like to think of Wiby as an engine for surfing, not searching. Runnaroo occasionally features a hit from Wiby. If you have a small site or blog that isn’t very “commercial”, consider submitting it to the index.
  * Search My Site: Similar to Wiby, but only indexes user-submitted personal and independent sites. It optionally supports IndieAuth.
  * Quor: seems to mainly index large news sites.
+ * Ninfex: a "people-powered" search engine that combines aspects of link aggregators and search. It lets users vote on submissions and it also displays links to forums about submissions.
  => https://wiby.me wiby.me
  => https://wiby.org wiby.org
  => https://searchmysite.net Search My site
  => https://www.quor.com Quor
+ => https://ninfex.com Ninfex
  ## Other languages

@@ -53,7 +53,7 @@ These are large engines that pass all the above tests and more.
  - Google: the biggest index. Allows submitting pages and sitemaps for crawling, but requires login. Powers a few other engines:
  - Startpage
- - Runnaroo
+ - (discontinued) Runnaroo
  - SAPO (Portuguese interface, can work with English results)
  - Bing: the runner-up. Allows submitting pages and sitemaps for crawling, but requires login. Its index powers many other engines:
  - Yahoo
@@ -105,6 +105,7 @@ These engines fail badly at a few important tests. Otherwise, they seem to work
  - [search.tl](http://www.search.tl/): Generalist search for one <abbr title="top-level domain">TLD</abbr> at a time (defaults to .com). I'm not sure why you'd want to always limit your searches to a single TLD, but now you can.[^9] There isn't any visible UI for changing the TLD for available results; you need to add/change the `tld` URL parameter. For example, to search .org sites, append `&tld=org` to the URL. It seems to be connected to [Amidalla](http://www.amidalla.de/), but Amidalla doesn't seem to currently be operational. Amidalla allows users to manually add URLs to its index and directory; I have yet to see if doing so impacts search.tl results.
  - [Kozmonavt](https://kozmonavt.ml/): Has a small index of almost 5 million sites. If I want to find the website for a certain project, Kozmonavt works well (provided its index has crawled said website). It works poorly for learning things and finding general information. I cannot recommend it for anything serious since it lacks contact information, a privacy policy, or any other information about the org/people who made it. Discovered in the seirdy.one access logs.
  - [Burf.co](https://burf.co/): Very small index, but seems fine at ranking more relevant results higher. Allows site submission without any extra steps.
+ - [ChatNoir](https://www.chatnoir.eu/): An experimental engine by researchers that uses the [Common Crawl](https://commoncrawl.org/) index. The engine is [open source](https://github.com/chatnoir-eu). See the [announcement](https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ) on the Common Crawl mailing list (Google Groups).
  ### Unusable engines, irrelevant results
@@ -125,7 +126,8 @@ These indexing search engines don't have a Google-like "ask me anything" endgame
  - Wiby: [wiby.me](https://wiby.me) and [wiby.org](https://wiby.org): I love this one. It focuses on smaller independent sites that capture the spirit of the "early" web. It's more focused on "discovering" new interesting pages that match a set of keywords than finding a specific resources. I like to think of Wiby as an engine for surfing, not searching. Runnaroo occasionally features a hit from Wiby. If you have a small site or blog that isn't very "commercial", consider submitting it to the index.
  - [Search My Site](https://searchmysite.net): Similar to Wiby, but only indexes user-submitted personal and independent sites. It optionally supports IndieAuth.
- - [Quor](https://www.quor.com): seems to mainly index large news sites.
+ - [Quor](https://www.quor.com): Seems to mainly index large news sites.
+ - [Ninfex](https://ninfex.com/): a "people-powered" search engine that combines aspects of link aggregators and search. It lets users vote on submissions and it also displays links to forums about submissions.
  Other languages
  ---------------