The following applies to minimal websites that focus primarily on text. It does not apply to websites that have a lot of non-textual content. It also does not apply to websites that focus more on generating revenue or pleasing investors than being good websites.
This is a "living document" that I add to as I receive feedback.
=> https://git.sr.ht/~seirdy/seirdy.one/log/master/item/content/posts/website-best-practices.gmi See the changelog
I realize not everybody's going to ditch the Web and switch to Gemini or Gopher today (that'll take, like, a month at the longest). Until that happens, here's a non-exhaustive, highly-opinionated list of best practices for websites that focus primarily on text. I don't expect anybody to fully agree with the list; nonetheless, the article should have *some* useful information for any web content author or front-end web developer.
My primary focus is inclusive design:
=> https://100daysofa11y.com/2019/12/03/accommodation-versus-inclusive-design/ Accommodation versus inclusive design
Specifically, I focus on supporting *underrepresented ways to read a page*. Not all users load a page in a common web-browser and navigate effortlessly with their eyes and hands. Authors often neglect people who read through accessibility tools, tiny viewports, machine translators, "reading mode" implementations, the Tor network, printouts, hostile networks, and uncommon browsers. Compatibility with so many niches sounds far more daunting than it really is: if you only selectively override browser defaults and use semantic HTML, you've done half of the work already.
I'd like to re-iterate yet another time that this only applies to websites that primarily focus on text. If graphics, interactivity, etc. are an important part of your website, less of the article applies. My hope is for readers to consider *some* points I make on this page the next time they build a website, and be aware of the trade-offs they make when they deviate. I don't expect--or want--anybody to follow all of my advice, because doing so would make the Web quite a boring place!
I'll cite the document "Techniques for WCAG 2.2" a number of times:
=> https://www.w3.org/WAI/WCAG22/Techniques/ Techniques for WCAG 2.2
Unlike the Web Content Accessibility Guidelines (WCAG), the Techniques document does not list requirements; rather, it serves to non-exhaustively educate authors about *how* to use specific technologies to comply with the WCAG. I don't find much utility in the technology-agnostic goals enumerated by the WCAG without the accompanying technology-specific techniques to meet those goals.
## Security and privacy
One of the defining differences between textual websites and advanced Web 2.0 sites/apps is safety. Most browser vulnerabilities are related to modern Web features like JavaScript and WebGL. The simplicity of basic textual websites *should* guarantee some extra safety; however, webmasters need to take additional measures to ensure limited use of "modern" risky features.
### TLS
All of the simplicity in the world won't protect a page from unsafe content injection by an intermediary. Proper use of TLS protects against page alteration in transit and ensures a limited degree of privacy. Test your TLS setup with these tools:
=> https://testssl.sh/ testssl.sh
=> https://webbkoll.dataskydd.net/ Webbkoll
If your OpenSSL (or equivalent) version is outdated or you don't want to download and run a shell script, SSL Labs' SSL Server Test should be equivalent to testssl:
=> https://www.ssllabs.com/ssltest/ SSL Server Test
Mozilla's HTTP Observatory offers a subset of Webbkoll's features and is a bit out of date, but it also gives a beginner-friendly score. Most sites should strive for at least a 50, but a score of 100 or even 120 shouldn't be too hard to reach.
=> https://observatory.mozilla.org/ HTTP Observatory
A false sense of security is far worse than transparent insecurity. Don't offer broken protocol versions or cipher suites; this includes TLS 1.0 and 1.1. Vintage computers can run TLS 1.2 implementations such as BearSSL surprisingly efficiently, leverage a TLS terminator, or fall back to a plain unencrypted connection. A broken cipher suite is security theater.
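How you refuse old protocol versions depends on your server software. As a minimal sketch, assuming a server built on Python's standard "ssl" module (an assumption for illustration; nothing here depends on a particular server stack), a minimum protocol version can be set like this:
```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """Build a server-side TLS context that refuses TLS 1.0 and 1.1."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Handshakes that negotiate anything older than TLS 1.2 will fail.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_server_context()
# A real server would also load its certificate and key here, e.g. with
# ctx.load_cert_chain(...); the paths are deployment-specific and omitted.
```
Most dedicated web servers and TLS terminators expose an equivalent setting; re-run the testing tools above after changing it.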
### Scripts and the Content Security Policy
Consider taking hardening measures to maximize the security benefits made possible by the simplicity of textual websites, starting with script removal.
JavaScript and WebAssembly are responsible for the bulk of modern web exploits. Ideally, a text-oriented site can enforce a scripting ban at the Content Security Policy (CSP) level.
=> https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP CSP on MDN
This is the CSP for my main website:
```
default-src 'none';
img-src 'self' data:;
style-src 'sha256-3U3TNinhti/dtVz2/wuS3peJDYYN8Yym+JcakOiXVes=';
style-src-attr 'none';
frame-ancestors 'none';
base-uri 'none';
form-action 'none';
manifest-src https://seirdy.one/manifest.min.ca9097c5e38b68514ddcee23bc6d4d62.webmanifest;
upgrade-insecure-requests;
sandbox allow-same-origin
```
"default-src 'none'" implies "script-src 'none'", causing a compliant browser to forbid the loading of scripts. Furthermore, the "sandbox" CSP directive forbids a wide variety of potentially insecure actions.
=> https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox "sandbox" CSP directive on MDN
While "script-src" restricts script loading, "sandbox" can also restrict script execution with stronger defenses against script injection (e.g. by a browser addon).¹ I added the "allow-same-origin" parameter so that these addons will still be able to function.²
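The "style-src" entry in the policy above is a hash source: the base64-encoded SHA-256 digest of the inline stylesheet's exact contents. The following is a rough sketch of how such a value can be computed; the function name and the sample stylesheet are hypothetical, not taken from my build process.
```python
import base64
import hashlib


def csp_hash_source(inline_content: str) -> str:
    """Return a CSP hash source ('sha256-...') for an inline style or script.

    The digest covers the exact text between the opening and closing tags,
    so any change (even whitespace) produces a different value.
    """
    digest = hashlib.sha256(inline_content.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"


# Hypothetical inline stylesheet, used only for illustration.
css = "body { max-width: 45em; }"
print("style-src " + csp_hash_source(css) + ";")
```
Because the hash changes whenever the stylesheet does, it's best computed at build time rather than maintained by hand.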
If you're able to control your HTTP headers, then use headers instead of a `<meta>` tag. In addition to not supporting certain directives, a CSP in a `<meta>` tag might let some items slip through:
> At the time of inserting the meta element to the document, it is possible that some resources have already been fetched. For example, images might be stored in the list of available images prior to dynamically inserting a meta element with an http-equiv attribute in the Content security policy state. Resources that have already been fetched are not guaranteed to be blocked by a Content Security Policy that's enforced late.
=> https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-security-policy HTML Living Standard, Content Security Policy state
### If you must enable scripts
Please use progressive enhancement³ throughout your site; every feature possible should be optional, and scripting is no exception.
I'm sure you're a great person, but your readers might not know that; don't expect them to trust your website. Your scripts should look as safe as possible to an untrusting eye. Avoid requesting permissions or using sensitive APIs:
=> https://browserleaks.com/javascript JavaScript Browser Information (BrowserLeaks)
Finally, consider using your CSP to restrict script loading. If you must use inline scripts, selectively allow them with a hash or nonce. Some recent directives restrict and enforce proper use of trusted types.
=> https://web.dev/trusted-types/ Trusted types
=> https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/trusted-types CSP trusted types on MDN
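For the nonce approach, here is a minimal sketch, assuming a server that can generate a fresh value for every response. The function names and the policy are hypothetical; the important property is that the same unpredictable nonce appears in both the header and the inline script tag, and is never reused.
```python
import secrets


def new_nonce() -> str:
    """Generate an unpredictable nonce; call this once per response."""
    # 16 random bytes, encoded with the URL-safe base64 alphabet.
    return secrets.token_urlsafe(16)


def script_csp(nonce: str) -> str:
    """Build a policy that only allows inline scripts carrying this nonce."""
    return f"default-src 'none'; script-src 'nonce-{nonce}'"


nonce = new_nonce()
header_value = script_csp(nonce)  # sent as the Content-Security-Policy header
script_tag = f'<script nonce="{nonce}">/* inline script */</script>'
```
A static site can't generate per-response nonces, so a hash is usually the better fit there.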
### Third-party content
Third-party content will complicate the CSP, allow more actors to track users, possibly slow page loading, and create more points of failure. Some privacy-conscious users actually block third-party content: while doing so is fingerprintable, it can reduce the amount of data collected about an already-identified user.
Some web developers deliver resources using third-party CDNs, such as jsDelivr or Unpkg. Traditional wisdom held that doing so would allow different websites to re-use cached resources; however, all mainstream browser engines now partition their caches to prevent this behavior:
=> https://privacycg.github.io/storage-partitioning/ Client-side storage partitioning (Privacy CG)
Avoid third-party content, if at all possible.
If you must use third-party content, ensure that third-party stylesheets and scripts leverage subresource integrity (SRI):
=> https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity SRI on MDN
=> https://www.w3.org/TR/SRI/ SRI specification
This prevents alteration without your consent. If you wish to be extra careful, you could use SRI for first-party resources too.
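An SRI value is a hash algorithm name followed by the base64-encoded digest of the resource's bytes. As a rough sketch (the function name, sample script body, and CDN URL below are all made up for illustration), one could be generated like this:
```python
import base64
import hashlib


def sri_value(resource: bytes) -> str:
    """Return a value for the "integrity" attribute, using SHA-384."""
    digest = hashlib.sha384(resource).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


# Hypothetical third-party script body; in practice, hash the exact bytes
# of the file you reviewed and intend to load.
body = b"console.log('hello');"
integrity = sri_value(body)
print(
    f'<script src="https://cdn.example/lib.js" '
    f'integrity="{integrity}" crossorigin="anonymous"></script>'
)
```
If the third party later serves different bytes, the digest no longer matches and a compliant browser refuses to use the resource.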
For embedded third-party content (e.g. images), give extra consideration to the "Beyond alt-text" section. Your page should be as useful as possible if the embedded content becomes inaccessible.
## About fonts
I recommend setting the default font to "sans-serif". Avoid "system-ui": it causes issues among readers whose system fonts don't cover your website's charset.
=> https://infinnie.github.io/blog/2017/systemui.html Never, ever use system-ui as the value of font-family
If you really want, you could use serif instead of sans-serif; however, serif fonts tend to look worse on low-res monitors. Not every screen's DPI has three digits. Accommodate users' default zoom levels by keeping your font size the same as most similar websites.
To ship custom fonts is to assert that branding is more important than user choice. That might very well be a reasonable thing to do; branding isn't evil! That being said, textual websites in particular don't benefit much from branding. Beyond basic layout and optionally supporting dark mode, authors generally shouldn't dictate the presentation of their websites; that should be the job of the user agent. Most websites are not important enough to look completely different from the rest of the user's system.
A personal example: I set my preferred browser font to "sans-serif", and map it to my preferred font in my computer's fontconfig settings. Now every website that uses sans-serif will have my preferred font. Sites with sans-serif blend into the users' systems instead of sticking out.
### But most users don't change their fonts...
The "users don't know better and need us to make decisions for them" mindset isn't without merits; however, in my opinion, it's overused. Using system fonts doesn't make your website harder to use, but it does make it smaller and stick out less to the subset of users who care enough about fonts to change them. This argument isn't about making software easier for non-technical users; it's about branding by asserting a personal preference.
### Can't users globally override stylesheets instead?
It's not a good idea to require users to automatically override website stylesheets to see their preferred fonts. Doing so would break websites that use fonts such as Font Awesome to display vector icons. We shouldn't have these users constantly battle with websites the same way that many adblocking/script-blocking users (myself included) already do when there's a better option.
That being said, many users *do* actually override stylesheets. We shouldn't *require* them to do so, but we should keep our pages from breaking in case they do. Pages following this article's advice will probably work perfectly well in these cases without any extra effort.
### Font fingerprinting concerns
Some people raised fingerprinting concerns when I suggested using the default "sans-serif" font. Websites could see which font this maps to in order to identify users.
I don't know much about fingerprinting, except that you can't do font enumeration or accurately calculate font metrics without JavaScript. Since text-based websites that follow these best practices don't send requests after the page loads and have no scripts, they shouldn't be able to fingerprint via font identification.
Other websites can still fingerprint via font enumeration using JavaScript. They don't need to stop at seeing what sans-serif maps to: they can see all the available fonts on a user's system, the user's canvas fingerprint, window dimensions, etc. Some of these can be mitigated with Firefox's protections against fingerprinting, but these protections understandably override user font preferences:
=> https://support.mozilla.org/en-US/kb/firefox-protection-against-fingerprinting Firefox's protection against fingerprinting
Ultimately, surveillance self-defense on the web is an arms race full of trade-offs. If you want both privacy and customizability, the web is not the place to look; try Gemini or Gopher instead.
## Against lazy loading
Lazy loading may or may not work. Some browsers, including Firefox and the Tor Browser, disable lazy-loading when the user turns off JavaScript. Turning it off makes sense because lazy-loading, like JavaScript, is a fingerprinting vector. Specifically, it identifies idiosyncratic scrolling patterns:
> Loading is only deferred when JavaScript is enabled. This is an anti-tracking measure, because if a user agent supported lazy loading when scripting is disabled, it would still be possible for a site to track a user's approximate scroll position throughout a session, by strategically placing images in a page's markup such that a server can track how many images are requested and when.
=> https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img#attr-loading The Image Embed element on MDN, the loading attribute
If you can’t rely on lazy loading, your pages should work well without it. If pages work well without lazy loading, is it worth enabling?
The scope of this article is textual content supplemented by images. In that context, I don't think lazy loading is worthwhile because it often frustrates users on slow connections. I think I can speak for some of these users: mobile data near my home has a number of "dead zones" with abysmal download speeds, and my home's Wi-Fi repeater setup used to result in packet loss rates above 60% (!!).
Users on poor connections have better things to do than idly wait for pages to load. They might open multiple links in background tabs to wait for them all to load at once, and/or switch to another task and come back when loading finishes. They might also open links while on a good connection before switching to a poor connection. For example, I often open several links on Wi-Fi before going out for a walk in a mobile-data dead-zone. A Reddit user reading an earlier version of this article described a similar experience when travelling by train:
=> https://i.reddit.com/r/web_design/comments/k0dmpj/an_opinionated_list_of_best_practices_for_textual/gdmxy4u/ u/Snapstromegon's comment
Unfortunately, pages with lazy loading don't finish loading off-screen images in the background. To load this content ahead of time, users need to switch to the loading page and slowly scroll to the bottom to ensure that all the important content appears on-screen and starts loading. Website owners shouldn't expect users to have to jump through these ridiculous hoops.
A similar attribute that I *do* recommend is the "decoding" attribute. I typically use `decoding="async"` so that image decoding can be deferred.
=> https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Img#attr-decoding decoding on MDN
### Would pre-loading or pre-fetching solve the issues with lazy-loading?
Pre-loading essential resources is fine, but speculatively pre-loading content that the user may or may not request isn’t.
A large number of users with poor connections also have capped data, and would prefer that pages don’t decide to predictively load many pages ahead-of-time for them. Some go so far as to disable this behavior to avoid data overages. Savvy privacy-conscious users also generally disable pre-loading since pre-loading behavior is fingerprintable.
Users who click a link *choose* to load a full page. Loading pages that a user hasn’t clicked on is making a choice for that user. I encourage adoption of “link” HTTP headers to pre-load essential and above-the-fold resources when possible, but doing so does not resolve the issues with lazy-loading: the people who are harmed by lazy loading are more likely to have pre-fetching disabled.
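A "link" header that pre-loads essential resources is a comma-separated list of targets, each with "rel=preload" and an "as" type. Here is a minimal sketch of building one; the helper name and the resource paths are hypothetical, and most servers let you set the same header directly in their configuration.
```python
def preload_link(path: str, resource_type: str) -> str:
    """Format one preload target for the HTTP "Link" header."""
    return f"<{path}>; rel=preload; as={resource_type}"


# Hypothetical above-the-fold resources for a single page.
header = ", ".join(
    [
        preload_link("/style.css", "style"),
        preload_link("/banner.webp", "image"),
    ]
)
print("Link: " + header)
```
Only list resources the current page is certain to need; anything speculative belongs to the user's choice, not the author's.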
### Can't users on poor connections disable images?
I have two responses:
1. If an image isn't essential, you shouldn't include it inline.
2. Yes, users could disable images. That's *their* choice. If your page uses lazy loading, you've effectively (and probably unintentionally) made that choice for a large number of users.
Nonetheless, expect some readers to have images disabled. Refer to the "Beyond alt-text" section to see how to best support this case.
### Related issues
Pages should finish making all network requests while loading, save for a form submission. This makes it easy to load pages in the background before disconnecting. I singled out lazy-loading, but other factors can violate this constraint.
One example is pagination. It's easier to download one long article ahead of time, but inconvenient to load each page separately. Displaying content all at once also improves searchability. The single-page approach has obvious limits: don't expect users to happily download a single-page novel.
Another common offender is infinite-scrolling. This isn't an issue without JavaScript. Some issues with infinite-scrolling were summed up quite nicely in a single panel on xkcd:
=> https://xkcd.com/1309/ xkcd: Infinite Scrolling
A hybrid between the two is paginated content in which users click a "load next page" link to load the next page below the current page (typically using "dynamic content replacement"). It's essentially the same as infinite scrolling, except additional content is loaded after a click rather than by scrolling. This is only slightly less bad than infinite scrolling; it still has the same fundamental issue of allowing readers to lose their place.
I've discussed loading pages in the background, but what about saving a page offline (e.g. with Ctrl + s)? While lazy-loading won't interfere with the ability to save a complete page offline, some of these related issues can. Excessive pagination and infinite scrolling make it impossible to download a complete page without manually scrolling or following pagination links to the end.
## Beyond alt-text
Expect some readers to have images disabled or unloaded. Examples include:
* Blind readers
* Users of metered connections: sometimes they disable all images, and other times they only disable images above a certain size
* People experiencing packet loss who only manage to load a few resources
* Users of textual browsers
Accordingly, follow good practices for alt-text. Concisely summarize the image content the best you can, without repeating the surrounding content. Don't include information that isn't present in the image; I'll cover how to handle supplementary information in the following subsections.
Alt-text is a good start, but we don't have to stop there.
Note: this section does not include examples of its own. If you wish to see examples, look at the blockquotes, code snippets, and linked images in the official version of this Gemini page. You're probably on it right now; if not, here's the canonical location:
=> gemini://seirdy.one/2020/11/23/website-best-practices.gmi An opinionated list of best practices for textual websites
On Gemini, much of this section applies to varying degrees. I typically employ this approach when linking to e.g. images. Sometimes, I even do this when linking to gemtext or HTML documents.
### Putting images in context
Alt text should be limited to describing content of the image. It lacks context. To make things worse, images can contain a great deal of information. Sighted people can "filter" this information and find areas to focus on; alt text should capture this detail. However, sighted users' understanding of this detail can be informed by surrounding less-essential detail.
Being sighted and loading images can introduce issues of its own. Sometimes, sighted readers might focus on the *wrong* part of an image. How can you give readers the missing context and tell them what to focus on?
The best solution comes in two parts:
1. Before the image, supply context that prepares readers with what to expect.
2. After the image, describe your interpretation of important details.
This is somewhat similar to the way most students in primary and secondary schools are taught to cite evidence in essays. On that note: remember that these are weak norms, not rules. Deviate where appropriate, just as students should as they learn to write.
### Figures
A *figure* is any sort of self-contained information that is referenced by--but somewhat distinct from--body content. Items that make for good figures are often found in floating blocks of print material.
=> https://en.wikipedia.org/wiki/Page_layout#Floating_block Floating block (Wikipedia)
Consider using a