mirror of https://git.sr.ht/~seirdy/seirdy.one synced 2024-11-27 14:12:09 +00:00

New engines: semantic scholar, SSEL

Add Semantic Scholar and Secret Search Engine Labs.
Rohan Kumar 2022-03-01 00:35:54 -08:00
parent 2bf7359303
commit 028a720186
GPG key ID: 1E892DB2A5F84479
2 changed files with 12 additions and 4 deletions


@ -108,21 +108,24 @@ These engines fail badly at a few important tests. Otherwise, they seem to work
* Kozmonavt: Has a small index of almost 5 million sites. If I want to find the website for a certain project, Kozmonavt works well (provided its index has crawled said website). It works poorly for learning things and finding general information. I cannot recommend it for anything serious since it lacks contact information, a privacy policy, or any other information about the org/people who made it. Discovered in the seirdy.one access logs.
* Burf.co: Very small index, but seems fine at ranking more relevant results higher. Allows site submission without any extra steps.
* Entfer: a newcomer that lets registered users upvote/downvote search results to customize ranking. Doesn't offer much information on who made it. Its index is small, but it does seem to return results related to the query.
* Siik: Lacks contact info, and the ToS and Privacy Policy links are dead. Seems to have PHP errors in the backend for some of its instant-answer widgets. If you scroll past all that, it does have web results powered by what seems to be its own index. These results do tend to be somewhat relevant, but the index seems too small for more specific queries.
=> https://meorca.com/ Meorca Search Engine
=> https://alpha.infotiger.com/ Infotiger
=> https://kozmonavt.ml/ Kozmonavt
=> https://burf.co/ Burf.co
=> https://entfer.com/ Entfer
* Siik: Lacks contact info, and the ToS and Privacy Policy links are dead. Seems to have PHP errors in the backend for some of its instant-answer widgets. If you scroll past all that, it does have web results powered by what seems to be its own index. These results do tend to be somewhat relevant, but the index seems too small for more specific queries.
* ChatNoir: An experimental engine by researchers that uses the Common Crawl index. The engine is open source. There's more information in its announcement on the Common Crawl mailing list (Google Groups).
=> https://siik.co/ Siik
* ChatNoir: An experimental engine by researchers that uses the Common Crawl index. The engine is open source. There's more information in its announcement on the Common Crawl mailing list (Google Groups).
* Secret Search Engine Labs: Very small index with very little SEO spam; it toes the line between a "search engine" and a "surf engine". It's best for reading about broad topics that would otherwise be dominated by SEO spam, thanks to its CashRank algorithm. Allows site submission.
=> https://www.chatnoir.eu/ ChatNoir
=> https://commoncrawl.org/ Common Crawl
=> https://github.com/chatnoir-eu ChatNoir source code (GitHub)
=> https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ ChatNoir Announcement
=> http://www.secretsearchenginelabs.com/ Secret Search Engine Labs
=> http://www.secretsearchenginelabs.com/tech/cashrank.php CashRank Algorithm
### Unusable engines, irrelevant results
@ -200,8 +203,10 @@ These engines try to find a website, typically at the domain-name level. They do
* Quor: seems to mainly index large news sites. Site is down as of June 2021. Originally available at www dot quor dot com.
* Ninfex: a "people-powered" search engine that combines aspects of link aggregators and search. It lets users vote on submissions and it also displays links to forums about submissions.
* Semantic Scholar: a search engine by the Allen Institute for AI focused on academic PDFs, with a couple hundred million papers indexed.
=> https://ninfex.com Ninfex
=> https://www.semanticscholar.org/ Semantic Scholar
## Other languages


@ -101,6 +101,7 @@ These engines fail badly at a few important tests. Otherwise, they seem to work
- [Entfer](https://entfer.com/): a newcomer that lets registered users upvote/downvote search results to customize ranking. Doesn't offer much information on who made it. Its index is small, but it does seem to return results related to the query.
- [Siik](https://siik.co/): Lacks contact info, and the ToS and Privacy Policy links are dead. Seems to have PHP errors in the backend for some of its instant-answer widgets. If you scroll past all that, it does have web results powered by what seems to be its own index. These results do tend to be somewhat relevant, but the index seems too small for more specific queries.
- [ChatNoir](https://www.chatnoir.eu/): An experimental engine by researchers that uses the [Common Crawl](https://commoncrawl.org/) index. The engine is [open source](https://github.com/chatnoir-eu). See the [announcement](https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ) on the Common Crawl mailing list (Google Groups).
- [Secret Search Engine Labs](http://www.secretsearchenginelabs.com/): Very small index with very little SEO spam; it toes the line between a "search engine" and a "surf engine". It's best for reading about broad topics that would otherwise be dominated by SEO spam, thanks to its [CashRank algorithm](http://www.secretsearchenginelabs.com/tech/cashrank.php). Allows site submission.
### Unusable engines, irrelevant results
@ -147,6 +148,7 @@ These engines try to find a website, typically at the domain-name level. They do
- Quor: Seems to mainly index large news sites. Site is down as of June 2021; originally available at www dot quor dot com.
- [Ninfex](https://ninfex.com/): a "people-powered" search engine that combines aspects of link aggregators and search. It lets users vote on submissions and it also displays links to forums about submissions.
- [Semantic Scholar](https://www.semanticscholar.org/): a search engine by the Allen Institute for AI focused on academic PDFs, with a couple hundred million papers indexed.
Other languages
---------------
@ -175,6 +177,7 @@ Misc
----
* Ask.com: The site is back. They claim to outsource search results. The results seem similar to Google, Bing, and Yandex; however, I can't pinpoint exactly where their results are coming from. Also, several sites from the "ask.com network" such as directhit.com, info.com, and kensaq.com have unique-looking results.
- Not evaluated: Apple's search. It's only accessible through a search widget in iOS and macOS and shows very few results. This might change; see the next section.
- Partially evaluated: [Infinity Search](https://infinitysearch.co): young, small index. It recently split into a paid offering with the main index and [Infinity Decentralized](https://infinitydecentralized.com/), the latter of which allows users to select from community-hosted crawlers. I managed to try it out before it became a paid offering, and it seemed decent; however, I wasn't able to run the tests listed in the "Methodology" section. Allows submitting URLs and sitemaps into a text box, no other work required.