mirror of
https://git.sr.ht/~seirdy/seirdy.one
synced 2024-11-23 21:02:09 +00:00
Compare commits
No commits in common. "734fc16df4b244a8d0a730f61037bf91cd67e01d" and "840bf38de6bad0efa823806677a48e2f03702564" have entirely different histories.
734fc16df4...840bf38de6
3 changed files with 1 additions and 36 deletions
@@ -1,19 +0,0 @@
----
-title: "DuckDuckGo and Bing"
-date: 2022-06-02T20:59:38-07:00
-replyURI: "https://www.librepunk.club/@penryn/108411423190214816"
-replyTitle: "how would html.duckduckgo.com fit into this?"
-replyType: "SocialMediaPosting"
-replyAuthor: "@penryn@www.librepunk.club"
-replyAuthorURI: "https://www.librepunk.club/@penryn"
----
-
-I was referring to crawlers that build indexes for search engines to use. DuckDuckGo does have a crawler---DuckDuckBot---but it's only used for fetching favicons and scraping certain sites for infoboxes ("instant answers", the fancy widgets next to/above the classic link results).
-
-DuckDuckGo and other engines that use Bing's commercial API have contractual arrangements that typically include a clause that says something like "don't you dare change our results, we don't want to create a competitor to Bing that has better results than us". Very few companies manage to negotiate an exception; DuckDuckGo is not one of those companies, to my knowledge.
-
-So to answer your question: it's irrelevant. "html.duckduckgo.com" is a JS-free front-end to DuckDuckGo's backend, and mostly serves as a proxy to Bing results.
-
-For the record, Google isn't any different when it comes to their API. That's why Ixquick shut down and pivoted to Startpage; Google wasn't happy with Ixquick integrating multiple sources.
-
-[More info on search engines](https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/).
@@ -1,15 +0,0 @@
----
-title: "Opt in telemetry"
-date: 2022-06-03T02:27:05-07:00
-replyURI: "https://news.ycombinator.com/item?id=31604932"
-replyTitle: "As far as I am concerned, telemetry is a good thing"
-replyType: "SocialMediaPosting"
-replyAuthor: "eterevsky"
-replyAuthorURI: "https://news.ycombinator.com/user?id=eterevsky"
----
-Being enrolled in a study should require prior informed consent. Terms of the data collection, including what data can be collected and how that data will be used, must be presented to all participants in language they can understand. Only then can they provide informed consent.
-
-Harvesting data without permission is just exploitation. Software improvements and user engagement are not more important than basic respect for user agency.
-
-Moreover, not everyone is like you. People who do have reason to care about data collection should not have their critical needs outweighed for the mere convenience of the majority. This type of rhetoric is often used to dismiss accessibility concerns, which is why we have to turn to legislation.
-
@@ -31,8 +31,7 @@ sed 7d "$html_file" | xmllint --format --encode UTF-8 --noent - -o "$tmp_file"
 tail -n +8 "$tmp_file" \
 | sd '<pre(?: tabindex="0")?>\n\t*<code ' '<pre tabindex="0"><code ' \
 | sd '(?:\n)?</code>\n(?:[\t\s]*)?</pre>' '</code></pre>' \
-| sd '</span>.span itemprop="familyName"' '</span> <span itemprop="familyName"' \
-| sd '([a-z])<(data|time)' '$1 <$2'
+| sd '</span>.span itemprop="familyName"' '</span> <span itemprop="familyName"'
 } >>"$xhtml_file"
 
 # replace the html file with the formatted xhtml5 file, excluding the xml declaration
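
The `sd '([a-z])<(data|time)' '$1 <$2'` substitution that appears in the hunk above looks like it re-inserts a space between a lowercase letter and an immediately following `<data` or `<time` tag. A minimal sketch of that behavior follows, under some assumptions: it uses `sed -E` (available in GNU and BSD sed) as a stand-in so it runs without `sd` installed, and the function name and sample input are hypothetical, not taken from the repository.

```shell
#!/bin/sh
# Hypothetical helper: approximates the `sd '([a-z])<(data|time)' '$1 <$2'`
# step with sed -E. Using sed instead of sd is an assumption of this sketch.
space_before_data_time() {
	# \1 is the lowercase letter, \2 is the tag name (data or time).
	sed -E 's/([a-z])<(data|time)/\1 <\2/g'
}

# Sample input invented for illustration.
printf '%s\n' 'Published on<time datetime="2022-06-02">June 2</time>' \
	| space_before_data_time
```

Text where the tag already follows a space or another tag's closing `>` is left unchanged, since the pattern only matches when a lowercase letter directly precedes the `<`.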