1
0
Fork 0
mirror of https://git.sr.ht/~seirdy/seirdy.one synced 2024-11-23 21:02:09 +00:00

New non-generalist search engine: High Browse

This commit is contained in:
Rohan Kumar 2022-12-20 13:50:45 -08:00
parent 5f1cf3b2b2
commit a72a0622ea
No known key found for this signature in database
GPG key ID: 1E892DB2A5F84479
3 changed files with 6 additions and 2 deletions

View file

@ -229,12 +229,14 @@ These engines try to find a website, typically at the domain-name level. They do
### Other ### Other
* High Browse: uses a non-traditional ranking algorithm which does an excellent job of introducing non-SEO-optimized serendipity into search results. One of my favorite "surf-engines", as opposed to traditional "search-engines".
* Keybot: A must-have for anyone who does translation work. It crawls the web looking for multilingual websites. Translators who are unsure about how to translate a given word or phrase can see its usage in two given languages, to learn from other human translators. My parents are fluent English speakers but sometimes struggle to express a given Hindi idiom in English; something like this could be useful to them, since machine translation isn't nuanced enough for every situation. Part of the TTN Translation Network. Discovered in my access logs. * Keybot: A must-have for anyone who does translation work. It crawls the web looking for multilingual websites. Translators who are unsure about how to translate a given word or phrase can see its usage in two given languages, to learn from other human translators. My parents are fluent English speakers but sometimes struggle to express a given Hindi idiom in English; something like this could be useful to them, since machine translation isn't nuanced enough for every situation. Part of the TTN Translation Network. Discovered in my access logs.
* Quor: seems to mainly index large news sites. Site is down as of June 2021. Originally available at www dot quor dot com. * Quor: seems to mainly index large news sites. Site is down as of June 2021. Originally available at www dot quor dot com.
* Semantic Scholar: a search engine by the Allen Institute for AI focused on academic PDFs, with a couple hundred million papers indexed. Discovered in my access logs. * Semantic Scholar: a search engine by the Allen Institute for AI focused on academic PDFs, with a couple hundred million papers indexed. Discovered in my access logs.
* Bonzamate: a search engine specifically for Australian websites. * Bonzamate: a search engine specifically for Australian websites.
* searchcode: A code-search engine by the developer of Bonzamate. Searches a hand-picked list of code forges for source code, supporting many search operators. * searchcode: A code-search engine by the developer of Bonzamate. Searches a hand-picked list of code forges for source code, supporting many search operators.
=> https://highbrow.se/ High Browse
=> https://www.keybot.com/ Keybot Translation Search Machine. => https://www.keybot.com/ Keybot Translation Search Machine.
=> https://www.semanticscholar.org/ Semantic Scholar => https://www.semanticscholar.org/ Semantic Scholar
=> https://bonzamate.com.au/ Bonzamate => https://bonzamate.com.au/ Bonzamate

View file

@ -259,6 +259,8 @@ These engines try to find a website, typically at the domain-name level. They do
### Other ### Other
[High Browse](https://highbrow.se/)
: Uses a non-traditional ranking algorithm which does an excellent job of introducing non-SEO-optimized serendipity into search results. One of my favorite "surf-engines", as opposed to traditional "search-engines".
[Keybot](https://www.keybot.com/) [Keybot](https://www.keybot.com/)
: A must-have for anyone who does translation work. It crawls the web looking for multilingual websites. Translators who are unsure about how to translate a given word or phrase can see its usage in two given languages, to learn from other human translators. My parents are fluent English speakers but sometimes struggle to express a given Hindi idiom in English; something like this could be useful to them, since machine translation isn't nuanced enough for every situation. Part of the [TTN Translation Network](https://www.ttn.ch/). Discovered in my access logs. : A must-have for anyone who does translation work. It crawls the web looking for multilingual websites. Translators who are unsure about how to translate a given word or phrase can see its usage in two given languages, to learn from other human translators. My parents are fluent English speakers but sometimes struggle to express a given Hindi idiom in English; something like this could be useful to them, since machine translation isn't nuanced enough for every situation. Part of the [TTN Translation Network](https://www.ttn.ch/). Discovered in my access logs.

View file

@ -1,4 +1,4 @@
{{- $wbmLinks := (slice "https://si3t.ch/log/2021-04-18-entetes-floc.html" "https://xmpp.org/2021/02/newsletter-02-feburary/" "https://gurlic.com/technology/post/393626430212145157" "https://gurlic.com/technology/post/343249858599059461" "https://www.librepunk.club/@penryn/108411423190214816" "https://benign.town/@josias/108457015755310198" "http://www.tuxmachines.org/node/148146" "https://i.reddit.com/r/web_design/comments/k0dmpj/an_opinionated_list_of_best_practices_for_textual/gdmxy4u/" "https://bbbhltz.space/posts/thoughts-on-tech-feb2021/" "https://jorts.horse/@alice/108477866954580532" "https://brid.gy/comment/mastodon/@Seirdy@pleroma.envs.net/AK7FeQ4h2tUCKNwlXc/AK7GtGkE7JOVgm1Cgi" "https://fosstodon.org/@werwolf/108529382741681838" "https://mastodon.social/@WahbAllat/108986614624476982" "https://linuxrocks.online/@friend/109029028283860044" "https://fosstodon.org/@fullstackthaumaturge/108765040526523487" "https://inhji.de/notes/an-opinionated-list-of-best-practices-for-textual-websites" "https://ravidwivedi.in/whatsapp/" "https://hackers.town/@theruran/108440680870400884" "https://hackers.town/@theruran/108440475936938471" "https://mckinley.cc/twtxt/2022-may-aug.html#2022-06-25T16:06:07-07:00" "https://tarnkappe.info/lesetipps-bayern-it-sicherheit-db-app-trackt-neue-eu-datenbank/" "https://catcatnya.com/@kescher/109221687024062842" "https://catcatnya.com/@kescher/109221707054861018" "https://catcatnya.com/@kescher/109221721385520640" "https://catcatnya.com/@kescher/109221750082044200" "https://brid.gy/post/twitter/seirdy/1536747178877673475" "https://markesler.com/blog/website-design/" "https://catcatnya.com/@kescher/108601418196537980" "https://chaos.social/@n0toose/109035270210401105" "https://nicfab.it/en/posts/aware-digital-communication-respecting-privacy-and-the-apps-or-services-you-choose/" "https://haxf4rall.com/2022/09/23/a-collection-of-articles-about-hardening-linux/" "https://mastodon.randomroad.social/@dctrud/108680634691924661" "https://brid.gy/post/twitter/seirdy/1535891978969174016" "https://mastodon.technology/@codeberg/108403449317373462" "https://harveyr.net/posts/14kb/" "https://brid.gy/comment/mastodon/@Seirdy@pleroma.envs.net/ANUjukccjwEmivz3ia/ANUmmjSDUviUeCz42S") -}} {{- $wbmLinks := (slice "https://si3t.ch/log/2021-04-18-entetes-floc.html" "https://xmpp.org/2021/02/newsletter-02-feburary/" "https://gurlic.com/technology/post/393626430212145157" "https://gurlic.com/technology/post/343249858599059461" "https://www.librepunk.club/@penryn/108411423190214816" "https://benign.town/@josias/108457015755310198" "http://www.tuxmachines.org/node/148146" "https://i.reddit.com/r/web_design/comments/k0dmpj/an_opinionated_list_of_best_practices_for_textual/gdmxy4u/" "https://bbbhltz.space/posts/thoughts-on-tech-feb2021/" "https://jorts.horse/@alice/108477866954580532" "https://brid.gy/comment/mastodon/@Seirdy@pleroma.envs.net/AK7FeQ4h2tUCKNwlXc/AK7GtGkE7JOVgm1Cgi" "https://fosstodon.org/@werwolf/108529382741681838" "https://mastodon.social/@WahbAllat/108986614624476982" "https://linuxrocks.online/@friend/109029028283860044" "https://fosstodon.org/@fullstackthaumaturge/108765040526523487" "https://inhji.de/notes/an-opinionated-list-of-best-practices-for-textual-websites" "https://ravidwivedi.in/whatsapp/" "https://hackers.town/@theruran/108440680870400884" "https://hackers.town/@theruran/108440475936938471" "https://mckinley.cc/twtxt/2022-may-aug.html#2022-06-25T16:06:07-07:00" "https://tarnkappe.info/lesetipps-bayern-it-sicherheit-db-app-trackt-neue-eu-datenbank/" "https://catcatnya.com/@kescher/109221687024062842" "https://catcatnya.com/@kescher/109221707054861018" "https://catcatnya.com/@kescher/109221721385520640" "https://catcatnya.com/@kescher/109221750082044200" "https://brid.gy/post/twitter/seirdy/1536747178877673475" "https://markesler.com/blog/website-design/" "https://catcatnya.com/@kescher/108601418196537980" "https://chaos.social/@n0toose/109035270210401105" "https://nicfab.it/en/posts/aware-digital-communication-respecting-privacy-and-the-apps-or-services-you-choose/" "https://haxf4rall.com/2022/09/23/a-collection-of-articles-about-hardening-linux/" "https://mastodon.randomroad.social/@dctrud/108680634691924661" "https://brid.gy/post/twitter/seirdy/1535891978969174016" "https://mastodon.technology/@codeberg/108403449317373462" "https://harveyr.net/posts/14kb/" "https://brid.gy/comment/mastodon/@Seirdy@pleroma.envs.net/ANUjukccjwEmivz3ia/ANUmmjSDUviUeCz42S" "https://forum.fedeproxy.eu/t/keep-platform-open-article/161" "https://forum.forgefriends.org/t/keep-platform-open-article/161") -}}
{{- $rewritesDict := dict "" "" -}} {{- $rewritesDict := dict "" "" -}}
{{- range $i, $r := (getCSV "," "/csv/rewrites.csv") -}} {{- range $i, $r := (getCSV "," "/csv/rewrites.csv") -}}
{{- $rewritesDict = merge $rewritesDict (dict (index $r 0) (index $r 1)) -}} {{- $rewritesDict = merge $rewritesDict (dict (index $r 0) (index $r 1)) -}}
@ -81,7 +81,7 @@
<a class="u-url" itemprop="url" href="{{ $src }}" rel="nofollow ugc">liked</a> this <a class="u-url" itemprop="url" href="{{ $src }}" rel="nofollow ugc">liked</a> this
{{ else -}} {{ else -}}
{{- if findRE `^https://brid.gy/[^/]*/mastodon` $webmention.source -}} {{- if findRE `^https://brid.gy/[^/]*/mastodon` $webmention.source -}}
{{- $canonicalSrc := replaceRE "https://brid.gy/.*pleroma.envs.net/([^/]*)(.*)?" `https://pleroma.envs.net/notice/$1` $src -}} {{- $canonicalSrc := replaceRE "https://brid.gy/.*mastodon/@Seirdy@pleroma.envs.net/([^/]*)(.*)?" `https://pleroma.envs.net/notice/$1` $src -}}
<a class="u-url" itemprop="url" href="{{ $canonicalSrc }}" rel="nofollow ugc"> <a class="u-url" itemprop="url" href="{{ $canonicalSrc }}" rel="nofollow ugc">
{{- else -}} {{- else -}}
<a class="u-url" itemprop="url" href="{{ $src }}" rel="nofollow ugc"> <a class="u-url" itemprop="url" href="{{ $src }}" rel="nofollow ugc">