mirror of
https://git.sr.ht/~seirdy/seirdy.one
synced 2024-11-14 01:32:11 +00:00
Update search engines
- Add ChatNoir, Ninfex
- Mark Runnaroo as discontinued
parent 0b72929cfd
commit 02b2345e83
2 changed files with 14 additions and 3 deletions
@@ -39,7 +39,7 @@ These are large engines that pass all the above tests and more.
 1. Google: the biggest index. Allows submitting pages and sitemaps for crawling, but requires login. Powers a few other engines:

 * Startpage
-* Runnaroo
+* (discontinued) Runnaroo
 * SAPO (Portuguese interface, can work with English results)

 2. Bing: the runner-up. Allows submitting pages and sitemaps for crawling, but requires login. Its index powers many other engines:
@@ -119,6 +119,13 @@ These engines fail badly at a few important tests. Otherwise, they seem to work
 => https://kozmonavt.ml/ Kozmonavt
 => https://burf.co/ Burf.co

+* ChatNoir: An experimental engine by researchers that uses the Common Crawl index. The engine is open source. There's more information in its announcement on the Common Crawl mailing list (Google Groups).
+
+=> https://www.chatnoir.eu/ ChatNoir
+=> https://commoncrawl.org/ Common Crawl
+=> https://github.com/chatnoir-eu ChatNoir source code (GitHub)
+=> https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ ChatNoir Announcement
+
 ### Unusable engines, irrelevant results

 Results from these search engines don’t seem at all useful.
@@ -145,11 +152,13 @@ These indexing search engines don’t have a Google-like “ask me anything” e
 * Wiby: I love this one. It focuses on smaller independent sites that capture the spirit of the “early” web. It’s more focused on “discovering” new interesting pages that match a set of keywords than finding a specific resources. I like to think of Wiby as an engine for surfing, not searching. Runnaroo occasionally features a hit from Wiby. If you have a small site or blog that isn’t very “commercial”, consider submitting it to the index.
 * Search My Site: Similar to Wiby, but only indexes user-submitted personal and independent sites. It optionally supports IndieAuth.
 * Quor: seems to mainly index large news sites.
+* Ninfex: a "people-powered" search engine that combines aspects of link aggregators and search. It lets users vote on submissions and it also displays links to forums about submissions.

 => https://wiby.me wiby.me
 => https://wiby.org wiby.org
 => https://searchmysite.net Search My site
 => https://www.quor.com Quor
+=> https://ninfex.com Ninfex

 ## Other languages

@@ -53,7 +53,7 @@ These are large engines that pass all the above tests and more.

 - Google: the biggest index. Allows submitting pages and sitemaps for crawling, but requires login. Powers a few other engines:
 - Startpage
-- Runnaroo
+- (discontinued) Runnaroo
 - SAPO (Portuguese interface, can work with English results)
 - Bing: the runner-up. Allows submitting pages and sitemaps for crawling, but requires login. Its index powers many other engines:
 - Yahoo
@@ -105,6 +105,7 @@ These engines fail badly at a few important tests. Otherwise, they seem to work
 - [search.tl](http://www.search.tl/): Generalist search for one <abbr title="top-level domain">TLD</abbr> at a time (defaults to .com). I'm not sure why you'd want to always limit your searches to a single TLD, but now you can.[^9] There isn't any visible UI for changing the TLD for available results; you need to add/change the `tld` URL parameter. For example, to search .org sites, append `&tld=org` to the URL. It seems to be connected to [Amidalla](http://www.amidalla.de/), but Amidalla doesn't seem to currently be operational. Amidalla allows users to manually add URLs to its index and directory; I have yet to see if doing so impacts search.tl results.
 - [Kozmonavt](https://kozmonavt.ml/): Has a small index of almost 5 million sites. If I want to find the website for a certain project, Kozmonavt works well (provided its index has crawled said website). It works poorly for learning things and finding general information. I cannot recommend it for anything serious since it lacks contact information, a privacy policy, or any other information about the org/people who made it. Discovered in the seirdy.one access logs.
 - [Burf.co](https://burf.co/): Very small index, but seems fine at ranking more relevant results higher. Allows site submission without any extra steps.
+- [ChatNoir](https://www.chatnoir.eu/): An experimental engine by researchers that uses the [Common Crawl](https://commoncrawl.org/) index. The engine is [open source](https://github.com/chatnoir-eu). See the [announcement](https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ) on the Common Crawl mailing list (Google Groups).

 ### Unusable engines, irrelevant results

@@ -125,7 +126,8 @@ These indexing search engines don't have a Google-like "ask me anything" endgame

 - Wiby: [wiby.me](https://wiby.me) and [wiby.org](https://wiby.org): I love this one. It focuses on smaller independent sites that capture the spirit of the "early" web. It's more focused on "discovering" new interesting pages that match a set of keywords than finding a specific resources. I like to think of Wiby as an engine for surfing, not searching. Runnaroo occasionally features a hit from Wiby. If you have a small site or blog that isn't very "commercial", consider submitting it to the index.
 - [Search My Site](https://searchmysite.net): Similar to Wiby, but only indexes user-submitted personal and independent sites. It optionally supports IndieAuth.
-- [Quor](https://www.quor.com): seems to mainly index large news sites.
+- [Quor](https://www.quor.com): Seems to mainly index large news sites.
+- [Ninfex](https://ninfex.com/): a "people-powered" search engine that combines aspects of link aggregators and search. It lets users vote on submissions and it also displays links to forums about submissions.

 Other languages
 ---------------