2022-0185](https://www.openwall.com/lists/oss-security/2022/01/18/7): a Linux 0-day found by the Crusaders of Rust a few weeks ago. It was discovered using the [syzkaller](https://github.com/google/syzkaller) kernel fuzzer. The process was documented on Will's Root:
CVE-2022-0185 - Winning a $31337 Bounty after Pwning Ubuntu and Escaping Google's KCTF Containers by {{}}
I _highly_ encourage giving it a read; it's the perfect example of fuzzing with sanitizers to find a vulnerability, reproducing the vulnerability (by writing a tiny C program), _then_ diving into the source code to find and fix the cause, and finally reporting the issue (with a patch!). When source isn't available, the vendor would assume responsibility for the "find and fix" steps.
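For userspace code, the "fuzz with sanitizers" part of that workflow can be sketched with Clang's libFuzzer. The harness below (and the deliberately buggy `parse_header()` it exercises) is purely illustrative and unrelated to the CVE above; it just shows how sanitizer instrumentation surfaces memory errors during fuzzing:

```c
/* fuzz_parse.c: a minimal sketch of fuzzing with sanitizers in userspace.
 * Build with Clang's libFuzzer and AddressSanitizer:
 *   clang -g -fsanitize=fuzzer,address fuzz_parse.c -o fuzz_parse
 * then run ./fuzz_parse and let it mutate inputs until ASan reports the
 * out-of-bounds write below. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parser with a deliberate stack-buffer overflow. */
static int parse_header(const uint8_t *data, size_t len) {
	char buf[8] = {0};
	if (len > 1 && data[0] == 'H')
		memcpy(buf, data, len); /* overflows buf when len > sizeof(buf) */
	return buf[0];
}

/* libFuzzer entry point: called repeatedly with mutated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
	parse_header(data, size);
	return 0;
}
```

A crash found this way can then be boiled down to a tiny standalone C program that replays the offending input, which is essentially the "reproduce" step from the write-up.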
The fact that some of the most-used pieces of FLOSS in existence have been the biggest beneficiaries of source-agnostic approaches to vulnerability analysis should be quite revealing. The source code to these projects has received attention from millions of eyes, yet they _still_ invest in fuzzing infrastructure and vulnerability-hunters prefer analyzing artifacts over inspecting the source.
Good counter-arguments
----------------------
I readily concede to several points in favor of source availability from a security perspective:
- Source code can make analysis _easier_ by _supplementing_ source-independent approaches. The lines between the steps I mentioned in the [four-step vulnerability-fixing process](#how-security-fixes-work) are blurry.
- Patching vulnerabilities is important. Source availability makes it possible for the community, package maintainers, or reporters of a vulnerability to patch software. Package maintainers often blur the line between "packager" and "contributor" by helping projects migrate away from abandoned/insecure dependencies. One example that comes to mind is the Python 2 to Python 3 transition for projects like Calibre.[^12] Being able to fix issues independent of upstream support is an important mitigation against [user domestication](../../../../2021/01/27/whatsapp-and-the-domestication-of-users/).
- Some developers/vendors don't distribute binaries that make use of modern toolchain-level exploit mitigations (e.g. PIE, RELRO, stack canaries, automatic variable initialization, [CFI](https://clang.llvm.org/docs/ControlFlowIntegrity.html), etc.[^13]). In these cases, building software yourself with these mitigations (or delegating it to a distro that enforces them) requires source code availability (or at least some sort of intermediate representation); see the build sketch after this list.
- Closed-source software may or may not have builds available that include sanitizers and debug symbols.
- Although fuzzing release binaries is possible, fuzzing is much easier to do when source code is available. Vendors of proprietary software seldom release special fuzz-friendly builds, and filtering out false positives can be quite tedious without understanding high-level design.
- It is certainly possible to notice a vulnerability in source code. Excluding low-hanging fruit typically caught by static code analysis and peer review, though, source review isn't how most vulnerabilities are found nowadays (thanks to {{}} for [reminding me about what source analysis does accomplish](https://lemmy.ml/post/167321/comment/117774)).
- Software as a Service can be incredibly difficult to analyze, as we typically have little more than the ability to query a server. Servers don't send core dumps, server-side binaries, or trace logs for analysis. Furthermore, it's difficult to verify which software a server is running.[^14] For services that require trusting a server, access to the server-side software is important from both a security and a user-freedom perspective.
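To make the exploit-mitigations point concrete, here's what rebuilding a trivial program with several of those mitigations could look like. The flags assume a recent Clang toolchain (GCC spells some of them differently), and the program itself is just a placeholder:

```c
/* hardened.c: trivial program used only to illustrate rebuilding with
 * toolchain-level mitigations. With a recent Clang toolchain, a hardened
 * build might look like:
 *
 *   clang -O2 -fPIE -pie -fstack-protector-strong \
 *         -ftrivial-auto-var-init=zero \
 *         -flto -fvisibility=hidden -fsanitize=cfi \
 *         -Wl,-z,relro,-z,now hardened.c -o hardened
 *
 * -fPIE/-pie make the executable position-independent so ASLR applies,
 * -Wl,-z,relro,-z,now enables full RELRO, -fstack-protector-strong adds
 * stack canaries, -ftrivial-auto-var-init=zero zero-initializes locals,
 * and -fsanitize=cfi (which requires LTO and hidden visibility) enables
 * Clang's forward-edge CFI. */
#include <stdio.h>

int main(void) {
	puts("rebuilt with exploit mitigations");
	return 0;
}
```

None of these flags can be applied retroactively to a vendor's release binary; they have to be set when the software is compiled, which is exactly why source (or an intermediate representation) matters here.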
Most of this post is written with the assumption that binaries are inspectable and traceable. Binary obfuscation and some forms of content protection/DRM violate this assumption and actually do make analysis more difficult.
Beyond source code, transparency into the development process helps assure users of compliance with good security practices. VCS history, patch reviews, linter configurations, etc. reveal the standards that code is held to, some of which relate to bug-squashing and security.
{{}} on Matrix also had a great response, which I agree with and adapt below:
Whether or not the source code is available for software does not change how insecure it is. However, there are good security-related incentives to publish source code.
- Doing so lowers the barrier to contribution, which makes it easier to patch vulnerabilities and improve the architecture over time. The fixes that follow can be _shared and used by other projects_ across the field, some of which can in turn be used by the vendor. This isn't a zero-sum game; a rising tide lifts all boats.
- It's generally good practice to assume an attacker has full knowledge of a system instead of relying on security through obscurity. Releasing code provides strong assurance that this assumption is being made. It's a way for vendors to put their money where their mouth is.
Both Patience and {{}} argue that given the above points, a project whose goal is maximum security would release code. Strictly speaking, I agree. Good intentions don't imply good results, but they can _supplement_ good results to provide some trust in a project's future.
Conclusion {#conclusion}
---------------
I've gone over some examples of how analyzing a program's security properties need not depend on source code, and how vulnerability discovery in both FLOSS and proprietary software uses source-agnostic techniques. Dynamic and static black-box techniques are powerful tools that work well from user-space (Zoom) to kernel-space (Linux) to low-level components like Intel ME+AMT. Source code enables the vulnerability-fixing process but has limited utility for the evaluation/discovery process.
Don't assume software is safer than proprietary alternatives just because its source is visible; come to a conclusion after analyzing both. There are lots of great reasons to switch from macOS or Windows to Linux (it's been my main OS for years), but security is [low on that list](https://madaidans-insecurities.github.io/linux.html).
All other things being mostly equal, FLOSS is obviously _preferable_ from a security perspective; I listed some reasons why in the counter-arguments section. Unfortunately, being helpful is not the same as being necessary. All I argue is that source unavailability does not imply insecurity, and source availability does not imply security. Analysis approaches that don't rely on source are typically the most powerful, and can be applied to both source-available and source-unavailable software. Plenty of proprietary software is more secure than FLOSS alternatives; few would argue that the sandboxing employed by Google Chrome or Microsoft Edge is more vulnerable than that of Pale Moon or most WebKitGTK-based browsers, for instance.
Releasing source code is just one thing vendors can do to improve audits; other options include releasing test builds with debug symbols/sanitizers, publishing docs describing their architecture, and/or just keeping software small and simple. We should evaluate software security through _study_ rather than source model. Support the right things for the right reasons, and help others make informed choices with accurate information. There are enough good reasons to support software freedom; we don't need to rely on bad ones.
[^1]: Writing an alternative or re-implementation doesn't require access to the original's source code, as is evidenced by a plethora of clean-room re-implementations of existing software written to circumvent the need to comply with license terms.
[^2]: Ideally well-documented, non-obfuscated code.
[^3]: Or a JIT compiler, or a [bunch of clockwork](https://en.wikipedia.org/wiki/Analytical_Engine), or...
[^4]: For completeness, I should add that there is one source-based approach that can verify correctness: formal proofs. Functional programming languages that [support dependent types](https://en.wikipedia.org/wiki/Dependent_type) can be provably correct at the source level. Assuming their self-hosted toolchains have similar guarantees, developers using these languages might have to worry less about bugs they couldn't find in the source code. This can alleviate concerns that their language runtimes can make it hard to reason about low-level behavior. Thanks to {{}} for pointing this out.
[^5]: For example: C, C++, Objective-C, Go, Fortran, and others can utilize sanitizers from Clang and/or GCC.
[^6]: This is probably what people in _The Matrix_ were using to see that iconic [digital rain](https://en.wikipedia.org/wiki/Matrix_digital_rain).
[^7]: This command only lists syscall names, but I did eventually follow the example of [sandbox-app-launcher](https://github.com/Whonix/sandbox-app-launcher) by allowing certain syscalls (e.g. ioctl) only when invoked with certain parameters. Also, I used [ripgrep](https://github.com/BurntSushi/ripgrep) because I'm more familiar with PCRE-style capture groups.
[^8]: Decrypting these packets typically involves saving and using key logs, or using endpoints with [known pre-master secrets](https://blog.didierstevens.com/2020/12/14/decrypting-tls-streams-with-wireshark-part-1/).
[^9]: I invite any modders who miss these debug symbols to check out the FLOSS [Minetest](https://www.minetest.net/), perhaps with the [MineClone2](https://content.minetest.net/packages/Wuzzy/mineclone2/) game.
[^10]: See pages 127-130 of Invisible Things Lab's [Quest to the Core slides](https://invisiblethingslab.com/resources/misc09/Quest%20To%20The%20Core%20%28public%29.pdf). Bear in mind that they often refer to AMT running atop ME.
[^11]: As an aside: your security isn't necessarily improved by "disabling" it, since it still runs during the initial boot sequence and does provide some hardening measures of its own (e.g., a TPM).
[^12]: In 2017, Calibre's author actually wanted to stay with Python 2 after its EOL date, and [maintain Python 2 himself](https://bugs.launchpad.net/calibre/+bug/1714107). Users and package maintainers were quite unhappy with this, as Python 2 would no longer be receiving security fixes after 2020. While official releases of Calibre use a bundled Python interpreter, distro packages typically use the system Python package; Calibre's popularity and insistence on using Python 2 made it a roadblock to getting rid of the Python 2 package in most distros. What eventually happened was that community members (especially {{}} and {{}}) submitted patches to migrate Calibre away from Python 2. Calibre migrated to Python 3 by [version 5.0](https://calibre-ebook.com/new-in/fourteen).
[^13]: Linux distributions' CFI+ASLR implementations rely on executables compiled with CFI+PIE support, and ideally with stack-smashing protectors and no-execute bits. These implementations are flawed (see [On the Effectiveness of Full-ASLR on 64-bit Linux](https://web.archive.org/web/20211021222659/http://cybersecurity.upv.es/attacks/offset2lib/offset2lib-paper.pdf) and [Brad Spengler's presentation comparing these with PaX's own implementation](https://grsecurity.net/PaX-presentation.pdf)).
[^14]: The [best attempt I know of](https://signal.org/blog/private-contact-discovery/) leverages [Trusted Execution Environments](https://en.wikipedia.org/wiki/Trusted_execution_environment), but for limited functionality using an implementation that's [far from bulletproof](https://en.wikipedia.org/wiki/Software_Guard_Extensions#Attacks).