The following applies to minimal websites that focus primarily on text. It does not apply to websites that have a lot of non-textual content. It also does not apply to websites that focus more on generating revenue or pleasing investors than being inclusive. This is a "living document" that I add to as I receive feedback. => https://git.sr.ht/~seirdy/seirdy.one/log/master/item/content/posts/website-best-practices.gmi See the changelog This is also a somewhat long read; for a summary, skip everything between the table of contents and the conclusion. I realize not everybody's going to ditch the Web and switch to Gemini or Gopher today (that'll take, like, a month at the longest). Until that happens, here's a non-exhaustive, highly-opinionated list of best practices for websites that focus primarily on text. I don't expect anybody to fully agree with the list; nonetheless, the article should have *some* useful information for any web content author or front-end web developer. My primary focus is inclusive design: => https://100daysofa11y.com/2019/12/03/accommodation-versus-inclusive-design/ Accomodation versus inclusive design. Specifically, I focus on supporting *under-represented ways to read a page*. Not all users load a page in a common web-browser and navigate effortlessly with their eyes and hands. Authors often neglect people who read through accessibility tools, tiny viewports, machine translators, "reading mode" implementations, the Tor network, printouts, hostile networks, and uncommon browsers, to name a few. I list more niches in the conclusion. Compatibility with so many niches sounds far more daunting than it really is: if you only selectively override browser defaults and use semantic HTML, you've done half of the work already. One of the core ideas behind the flavor of inclusive design I present is being *inclusive by default.* Web pages shouldn't use accessible overlays, reduced-data modes, or other personalizations if these features can be available all the time. Of course, some features conflict; you can't display a light and dark color scheme simultaneously. Personalization is a fallback strategy to resolve conflicting needs. Disproportionately under-represented needs deserve disproportionately greater attention, so they come before personal preferences instead of being relegated to a separate lane. Another focus is minimalism. Progressive enhancement is a simple, safe idea that tries to incorporate some responsibility into the design process without rocking the boat too much. On top of progressive enhancement, I prefer limiting any enhancements to ones that are demonstrated to solve specific accessibility, security, performance, or significant usability problems faced by people besides me. I'd like to re-iterate yet another time that this only applies to websites that primarily focus on text. If graphics, interactivity, etc. are an important part of your website, less of the article applies. My hope is for readers to consider a subset of this page the next time they build a website, and address the trade-offs they make when they deviate. I don't expect--or want--anybody to follow all of my advice, because doing so would make the Web quite a boring place! I'll cite the Web Accessibility Initiative's (WAI) "Techniques for WCAG 2.2" a number of times: => https://www.w3.org/WAI/WCAG22/Techniques/ Techniques for WCAG 2.2 Unlike the Web Content Accessibility Guidelines (WCAG), the Techniques document does not list requirements; rather, it serves to non-exhaustively educate authors about *how* to use specific technologies to comply with the WCAG. I don't find much utility in the technology-agnostic goals enumerated by the WCAG without the accompanying technology-specific techniques to meet those goals. ## Security and privacy One of the defining differences between textual websites and advanced Web 2.0 sites/apps is safety. Most browser vulnerabilities are related to modern Web features like JavaScript and WebGL. If that isn't reason enough, most non-mainstream search indexes have little to no support for JavaScript: => ./../../../2021/03/10/search-engines-with-own-indexes.gmi A look at search engines with their own indexes The simplicity of basic textual websites *should* guarantee some extra safety; however, webmasters need to take additional measures to ensure limited use of "modern" risky features. ### TLS All of the simplicity in the world won't protect a page from unsafe content injection by an intermediary. Proper use of TLS protects against page alteration in transit and ensures a limited degree of privacy. Test your TLS setup with these tools: => https://testssl.sh/ testssl.sh => https://webbkoll.dataskydd.net/ Webbkoll If your OpenSSL (or equivalent) version is outdated or you don't want to download and run a shell script, SSL Labs' SSL Server Test should be equivalent to testssl: => https://www.ssllabs.com/ssltest/ SSL Server Test Mozilla's HTTP Observatory offers a subset of Webbkoll's features and is a bit out of date (and requires JavaScript), but it also gives a beginner-friendly score. Most sites should strive for at least a 50, but a score of 100 or even 120 shouldn't be too hard to reach. => https://observatory.mozilla.org/ HTTP Observatory A false sense of security is far worse than transparent insecurity. Don't offer broken TLS ciphers, including TLS 1.0 and 1.1. Vintage computers can run TLS 1.2 implementations such as BearSSL surprisingly efficiently, leverage a TLS terminator, or they can use a plain unencrypted connection. A broken cipher suite is security theater. ### Scripts and the Content Security Policy Consider taking hardening measures to maximize the security benefits made possible by the simplicity of textual websites, starting with script removal. JavaScript and WebAssembly are responsible for the bulk of modern web exploits. Ideally, a text-oriented site can enforce a scripting ban at the Content Security Policy (CSP) level. => https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP CSP on MDN This is the CSP for my main website, with hashes removed for readability: ``` default-src 'none'; img-src 'self' data:; style-src 'sha256-HASH'; style-src-attr 'none'; frame-ancestors 'none'; base-uri 'none'; form-action 'none'; manifest-src https://seirdy.one/manifest.min.HASH.webmanifest; upgrade-insecure-requests; sandbox allow-same-origin ``` "default-src: 'none'" implies "script-src: 'none'", causing a compliant browser to forbid the loading of scripts. Furthermore, the "sandbox" CSP directive forbids a wide variety) of potentially insecure actions. => https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox "sandbox" CSP directive on MDN While "script-src" restricts script loading, "sandbox" can also restrict script execution with stronger defenses against script injection (e.g. by a browser addon).¹ I added the "allow-same-origin" parameter so that these addons will still be able to function.² If you're able to control your HTTP headers, then use headers instead of a tag. In addition to not supporting certain directives, a CSP in a tag might let some items slip through: > At the time of inserting the meta element to the document, it is possible that some resources have already been fetched. For example, images might be stored in the list of available images prior to dynamically inserting a meta element with an http-equiv attribute in the Content security policy state. Resources that have already been fetched are not guaranteed to be blocked by a Content Security Policy that's enforced late. => https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-security-policy HTML Living Standard, Content Security Policy state ### If you must enable scripts Please use progressive enhancement³ throughout your site; every feature possible should be optional, and scripting is no exception. I'm sure you're a great person, but your readers might not know that; don't expect them to trust your website. Your scripts should look as safe as possible to an untrusting eye. Avoid requesting permissions or using sensitive APIs: => https://browserleaks.com/javascript JavaScript Browser Information (BrowserLeaks) Finally, consider using your CSP to restrict script loading. If you must use inline scripts, selectively allow them with a hash or nonce. Some recent directives restrict and enforce proper use of trusted types. => https://web.dev/trusted-types/ Trusted types => https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/trusted-types CSP trusted types on MDN ### Third-party content Third-party content will complicate the CSP, allow more actors to track users, possibly slow page loading, and create more points of failure. Some privacy-conscious users actually block third-party content: while doing so is fingerprintable, it can reduce the amount of data collected about an already-identified user. Some web developers deliver resources using third-party CDNs, such as jsDelivr or Unpkg. Traditional wisdom held that doing so would allow different websites to re-use cached resources; however, all mainstream browsers engines now partition their caches to prevent this behavior: => https://privacycg.github.io/storage-partitioning/ Avoid third-party content, if at all possible. If you must use third-party content, use subresource integrity (SRI): => https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity SRI on MDN => https://www.w3.org/TR/SRI/ SRI specification This prevents alteration without your consent. If you wish to be extra careful, you could use SRI for first-party resources too. Be sure to check the privacy policies for the third party services _and_ subscribe to updates, as their practices could impact the privacy of all your users. For embedded third-party content (e.g. images), give extra consideration to the "Beyond alt-text" section. Your page should be as useful as possible if the embedded content becomes inaccessible. ## Optimal loading Nearly every Internet user has to deal with unreliable connections every now and then, even the top 1%. Developing regions lack modern Internet infrastructure; high-ranking executives travel frequently. Everybody hits the bottom of the bell-curve. Reducing load time is especially useful to poorly-connected users. For much of the world, connectivity comes in short bursts during which loading time is precious. Chances of a connection failure or packet loss increase with time. Optimal loading is a complex topic. Broadly, it covers three overlapping categories: reducing payload size, delivering content early, and reducing the number of requests and round trips. ### Blocking content HTML is a blocking resource: images and stylesheets will not load until the user agent loads and parses the HTML that calls them. To start loading above-the-fold images before the HTML parsing finishes, send a "link" HTTP header. My website includes a "link" header to load an SVG that serves as my IndieWeb photo and favicon (hash removed from filename for readability): ``` link: ; rel=preload; as=image ``` ### Don't count on the cache The browser cache is a valuable tool to save bandwidth and improve page speed, but it's not as reliable as many people seem to believe. Don't focus too much on "repeat view" performance. Out of privacy concerns, most browsers no longer re-use cached content across sites; refer back to the "third-party content" section for more details. Privacy-conscious users (including all users using "private" or "incognito" sessions) will likely have their caches wiped between sessions. Don't focus too much on "repeat view" performance until after your "initial view" performance is good enough. Furthermore: a high number of cached resources can decrease performance of the cache, causing browsers to bypass the cache: => https://simonhearne.com/2020/network-faster-than-cache When Network is Faster than Cache The effect is especially pronounced on low-end phones and mechanical hard drives. One way to help browsers decide which disk-cached resources to prioritize is to use immutable assets. Include the `immutable` directive in your cache-control headers, and cache-bust modified assets by changing their URLs. The only external assets on my pages are images and a web app manifest (for icons); I mark these assets as "immutable" and cache-bust them by including checksums in their filenames: ``` cache-control: max-age=31557600, immutable ``` You can also keep your asset counts low by combining textual assets (e.g. CSS) and inlining small resources. ### Inline content In addition to HTML, CSS is also a blocking resource. You could pre-load your CSS using a "link" header. Alternatively: if your gzipped CSS is under one kilobyte, consider inlining it in the using a