mirror of https://git.sr.ht/~seirdy/seirdy.one synced 2024-11-23 21:02:09 +00:00

Compare commits

4 commits

Author SHA1 Message Date
Rohan Kumar
afcc4fd760
Stricter feed validation false-positive filtering 2023-08-27 14:31:42 -07:00
Rohan Kumar
838d73ed22
Reduce zopfli level, switch from jq to jaq
jaq is faster and more strict which is exactly what I want. zopfli level
9 is overkill and takes almost twice as long for barely any difference.
2023-08-27 14:31:32 -07:00
Rohan Kumar
64914b284c
Add receipts for an instance 2023-08-27 14:09:00 -07:00
Rohan Kumar
b95468a93a
punctuation and syndication 2023-08-26 15:06:35 -07:00
5 changed files with 23 additions and 26 deletions

View file

@ -16,8 +16,9 @@ OUTPUT_DIR = public
SSHFLAGS = -o KexAlgorithms=sntrup761x25519-sha512@openssh.com
RSYNCFLAGS += -rlpcv --zc=zstd --zl=6 --skip-compress=gz/br/zst/png/webp/jpg/avif/jxl/mp4/mkv/webm/opus/mp3/gif/ico -e "ssh $(SSHFLAGS)" --chmod=D755,F644
RSYNCFLAGS_EXTRA ?=
# compression gets slow for extreme levels like the old "70109"
ECT_LEVEL=9
# compression gets slow for extreme levels like the old "70109".
# Diminishing returns after level 6; sometimes even larger files.
ECT_LEVEL=6
csv/webrings.csv:
sh scripts/populate-webrings.sh
@ -31,14 +32,14 @@ hugo: csv/webrings.csv $(SRCFILES)
# .hintrc-local for linting local files
# same as regular .hintrc but with a different connector.
.hintrc-local: .hintrc
jq --tab '.connector .name = "local" | del(.connector .options)' <linter-configs/hintrc >.hintrc-local
jaq --tab '.connector .name = "local" | del(.connector .options)' <linter-configs/hintrc >.hintrc-local
.hintrc-devserver: .hintrc
jq --tab '.extends = ["development"] | .hints["http-compression","https-only","ssllabs","sri"] = "off"' <linter-configs/hintrc >.hintrc-devserver
jaq --tab '.extends = ["development"] | .hints["http-compression","https-only","ssllabs","sri"] = "off"' <linter-configs/hintrc >.hintrc-devserver
.PHONY: clean
clean:
rm -rf $(OUTPUT_DIR) .lighthouseci lighthouse-reports mentions.json data/webmentions.json
rm -rf $(OUTPUT_DIR) .lighthouseci lighthouse-reports
.PHONY: lint-css
lint-css: $(CSS_DIR)/*.css
@ -56,8 +57,8 @@ equal-access:
.PHONY: validate-json
validate-json:
jq -reM '""' $(OUTPUT_DIR)/manifest.min.*.webmanifest 1>/dev/null
jq -reM '""' $(OUTPUT_DIR)/webfinger.json 1>/dev/null
jaq -re '""' $(OUTPUT_DIR)/manifest.min.*.webmanifest 1>/dev/null
jaq -re '""' $(OUTPUT_DIR)/webfinger.json 1>/dev/null
.PHONY: validate-html
validate-html:
@ -117,15 +118,6 @@ xhtmlize:
copy-to-xhtml:
find $(OUTPUT_DIR) -type f -name "*.html" -exec sh scripts/copy-file-to-xhtml.sh {} \;
# save webmentions to a file, don't send yet
mentions.json: hugo
# gather old version of the site
# rsync $(RSYNCFLAGS) --exclude '*.gz' --exclude '*.br' --exclude '*.png' --exclude-from .rsyncignore $(WWW_RSYNC_DEST)/ old
static-webmentions -f mentions.json.unfiltered find
# filter the webmentions a bit; jq offers more flexibility than config.toml
jq '[ .[] | select(.Dest|test("https://(git.sr.ht/~seirdy/seirdy.one/log/master|seirdy.one|web.archive.org|archive.is|en.wikipedia.org|matrix.to|([a-z]*.)?reddit.com|github.com)") | not) ]' <mentions.json.unfiltered >mentions.json
rm mentions.json.unfiltered
.PHONY: deploy-html
deploy-html:
rsync $(RSYNCFLAGS) $(RSYNCFLAGS_EXTRA) --exclude 'gemini' --exclude '*.gmi' --exclude-from .rsyncignore $(OUTPUT_DIR)/ $(WWW_RSYNC_DEST) --delete

View file

@ -6,11 +6,11 @@ replyTitle: "“regular” expressions"
replyType: "SocialMediaPosting"
replyAuthor: "Chjara"
replyAuthorURI: "https://tuxcrafting.online/"
#syndicatedCopies:
# - title: 'The Fediverse'
# url: ''
syndicatedCopies:
- title: 'The Fediverse'
url: 'https://pleroma.envs.net/notice/AZ8TzJQpYkHFYzw0CO'
---
De-facto standard extensions for recursion and variable-length look-arounds have existed for ages; the word "regular" in most regular-expression engines is there for historical reasons. I first read about this in {{<mention-work itemtype="TechArticle">}}{{<cited-work name="Apocalypse 5: Pattern Matching" extraName="headline" url="https://raku.org/archive/doc/design/apo/A05.html">}} by {{<indieweb-person itemprop="author" first-name="Larry" last-name="Wall" url="http://www.wall.org/~larry/">}}{{</mention-work>}} (he loves his biblical terminology).
I _would_ like to just use Raku rules for a concise way to describe more advanced grammars; I'd then just keep my regexes to the PCRE subset that's common between Google's RE2 and the Rust regex crate; I doubt they're both "regular" but both guarantee linear time matching. Part of the reason I don't do this is portability. Not everything runs Raku, but almost every platform has [a regex engine with the features I need](https://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines).
I _would_ like to just use Raku rules for a concise way to describe more advanced grammars; I'd then just keep my regexes to the PCRE subset that's common between Google's RE2 and the Rust regex crate. I doubt they're both "regular" but both guarantee linear time matching. Part of the reason I don't do this is portability. Not everything runs Raku, but almost every platform has [a regex engine with the features I need](https://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines).

View file

@ -649,6 +649,12 @@ whinge.town OR whinge.house
: [Ableist misogyny](https://web.archive.org/web/20230806223813/https://whinge.town/notice/AXq0Y0n2HDzVWsmOMi).
: Both instances were set up by the same user ([confirmation 1](https://web.archive.org/web/20230806224027/https://whinge.house/notice/ARzsikv2kOK3V5JaWO), [confirmation 2](https://web.archive.org/web/20230806224027/https://freespeechextremist.com/notice/ARzRYo4L2sPkJv1E2a)), who mains on the former.
wideboys.org
: There used to be an instance on the "social" subdomain, but it shut down. However, there is still a WriteFreely instance on the "blog" subdomain. The instance on the "social" subdomain has been mostly superseded by beefyboys.win.
: On the root domain is [a wiki describing how this domain is affiliated with beefyboys.win](https://web.archive.org/web/20230827195937/https://wideboys.org/BEEFYBOYS.WIN). The [beefyboys.win "about" page](https://web.archive.org/web/20230827200822/https://beefyboys.win/about) confirms this.
: Since beefyboys.win is on FediNuke and wideboys.org is part of the same network with staff and member overlap, and wideboys.org still federates on the "blog" subdomain, it's on the list too. But since it only federates via WriteFreely at the time of writing, it looks like a smaller harassment vector so it's demoted to my tier-0 list.
{{</ nofollow >}}
</details>

View file

@ -52,16 +52,15 @@ run_validator() {
}
validate_feed() {
# silence "self reference doesn't match" because i'm testing a localhost copy.
# entries with the same timestamp isn't a big deal
# unregistered link relationship is a false positive caused by an unknown namespace (rel-mentioned).
rel_mention_string="Unregistered link relationship \($rel_mention_count occurrence"
if [ "$rel_mention_count" = '1' ]; then
rel_mention_string="Unregistered link relationship"
fi
# silence "self reference doesn't match" because i'm testing a localhost copy.
# "should not contain" has a false positive triggered by ARIA
# entries with the same timestamp isn't a big deal
# unregistered link relationship is a false positive caused by an unknown namespace.
full_regex="Use of unknown namespace|Self reference doesn't match|$rel_mention_string|entries with the same value|Validating $url"
full_regex="Use of unknown namespace|Self reference doesn't match|$rel_mention_string|entries with the same value for atom:updated|Validating $url"
run_validator \
| grep -Ev "$full_regex"
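
The filtering in this hunk can be tried in isolation. The following is a minimal sketch, not the script itself: it reuses the regex construction shown above, but the `url` and `rel_mention_count` values, the `filter_feed_output` helper, and the sample validator messages are all assumptions made up for illustration. It shows why the narrower "entries with the same value for atom:updated" pattern matters: a duplicate-value warning about any other element is no longer silenced.

```shell
#!/bin/sh
# Assumed values; in the real script these are set elsewhere.
url='http://localhost:8089/atom.xml'
rel_mention_count=2

# Same construction as in the hunk: with more than one occurrence, the
# message is assumed to carry a count, so the count is part of the pattern.
rel_mention_string="Unregistered link relationship \($rel_mention_count occurrence"
if [ "$rel_mention_count" = '1' ]; then
    rel_mention_string="Unregistered link relationship"
fi

full_regex="Use of unknown namespace|Self reference doesn't match|$rel_mention_string|entries with the same value for atom:updated|Validating $url"

# Hypothetical helper: drop every line that matches a known false positive.
filter_feed_output() {
    grep -Ev "$full_regex"
}

# Hypothetical validator output, for illustration only.
sample_output="Validating http://localhost:8089/atom.xml
Use of unknown namespace: http://example.org/ns
Unregistered link relationship (2 occurrences)
Two entries with the same value for atom:updated
Two entries with the same value for atom:id"

remaining="$(printf '%s\n' "$sample_output" | filter_feed_output)"
printf '%s\n' "$remaining"
```

With the older, broader pattern ("entries with the same value"), the last sample line would have been filtered out as well; with the stricter pattern it is the only line that survives.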

View file

@ -3,7 +3,7 @@
set -e -u
root_dir="$(dirname "$0")/.."
vnu_output="$(jq --from-file "$root_dir/linter-configs/vnu_filter.jq")"
vnu_output="$(jaq --from-file "$root_dir/linter-configs/vnu_filter.jq")"
if [ "$vnu_output" = '' ]; then
echo "All markup is valid"
else