Open source software is everywhere these days—which is great—but how can you be sure that you should trust the software you've downloaded to do what you want? The area of software supply chain management—of which this discussion forms a part—is fairly newly visible in the industry but is growing in importance. I'm going to consider a particular example.
First, though, this isn't one of those police dramas where a suspect parcel arrives at the precinct and someone realises just in time that it may be a bomb. What I'm talking about here are open source software packages (although the impact on your application may be similar if you're not sufficiently suspicious). There's a huge conversation to be had about what trust means as a starting point (and I have a forthcoming book on Trust in Computing and the Cloud for Wiley).
For the purpose of this article, say you need a library that provides some cryptographic protocol implementation. What do you need to know, and what are your choices? For now, I'll assume that you've already made what is almost certainly the right choice and gone with an open source implementation (see many of my previous articles for why open source is just best for security), and you don't want to be building everything from source all the time. You need something stable and maintained. What should be your source for a new package?
Option 1 – Use a vendor
There are many vendors out there that provide open source software through a variety of mechanisms—typically subscription. Red Hat, my employer (see the standard disclosure on my blog), is one of them. In this case, the vendor will typically stand behind a particular package's fitness for use, provide patches, etc. This is your easiest and best choice in many cases. There may be times, however, when you want to use a package that is not provided by a vendor or not packaged by your vendor of choice. What do you do then? Equally, what decisions do vendors need to make about how to trust a package?
Option 2 – Delve deeper
This is where things get complex. So complex, in fact, that I'm going to be examining them at some length in my book. In this article, though, I'll try to be brief. I'll start with the assumption that there is a single maintainer of the package and multiple contributors. The contributors provide code (and tests and documentation, etc.) to the project, and the maintainer provides builds—binaries/libraries—for you to consume, rather than you taking the source code and compiling it yourself (which is actually what a vendor is likely to do, though they still need to consider most of the points below). This library provides cryptographic capabilities, so it's fairly safe to assume that you care about its security. You need to consider at least five specific areas in detail, all of them relying on the maintainer to a large degree. (I've used the example of security here, although very similar considerations exist for almost any package.) Take a look at the issues.
- Build: How is the package you are consuming created? Is the build process performed on a "clean" (that is, non-compromised) machine with the appropriate compilers and libraries? (There's a "turtles all the way down" problem here: the compilers and libraries were themselves built by something.) If the binary is created with untrusted tools, then how can you trust it at all, and what measures does the maintainer take to ensure the "cleanness" of the build environment? Ideally, the build process is documented as a repeatable build so that anyone who wants to check it can do so.
- Integrity: This is related to build, in that you want to be sure that the source code inputs to the build process—the code coming, for instance, from a Git repository—are what you expect. If, somehow, compromised code is injected into the build process, then you are in a very bad position. You want to know exactly which version of the source code is being used as the basis for the package you are consuming so that you can track features—and bugs. As above, having a repeatable build is a great bonus here.
- Responsiveness: This is a measure of how quickly the maintainer responds to changes. Generally, you want stable features tied to known versions but a quick response to bug reports and, in particular, security patches. If the maintainer doesn't accept patches in a timely manner, you need to worry about your package's security. You should also be asking questions like, "Is there a well-defined security disclosure or vulnerability management process?" (see my article "Security disclosure or vulnerability management?"). And if so, "Is it followed?"
- Provenance: Not all code is created equal, and one of the things a maintainer should be keeping track of is the provenance of contributions: who is contributing what. If an unknown contributor with a pseudonymous email address and no history of security functionality contributions suddenly submits a large amount of code in a part of the package that provides particularly sensitive features, this should raise alarm bells. On the other hand, if a group of contributors employed by a company with a history of open source contributions and well-reviewed code submits a large patch, this is probably less troublesome. This is a difficult issue to manage, and there are typically no definite "OK" or "no-go" signs, but the maintainer's awareness and management of contributors and their contributions is an important point to consider.
- Expertise: This is the trickiest. You may have a maintainer who is excellent at managing all the points above but is just not an expert in certain aspects of the contributed code's functionality. As a consumer of the package, however, you need to be sure that it is fit for purpose, and that may include (in the case of the security-related package considered here) being assured that the correct cryptographic primitives are used, that bounds-checking is enforced on byte streams, that proper key lengths are used, or that constant-time implementations are provided for particular primitives. This is very hard, and the maintainer's job can easily become a full-time one if they are acting as the expert for a large and/or complex project. Indeed, best practice in such cases is to have a team of trusted, experienced experts who work either as co-maintainers or as a senior advisory group for the project. Alternatively, having external people or organisations (such as industry bodies) audit the project at critical junctures (for instance, when a major release is due or when an important vulnerability is patched) allows the maintainer to share this responsibility. It's important to note that the project does not become magically "secure" just because it's open source (see "Disbelieving the many eyes hypothesis"), but the community, when it comes together, can significantly improve the confidence consumers of a project can have in the packages it produces.
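To make the Build and Integrity points a little more concrete, here is a minimal Python sketch of two checks a consumer could run. It assumes the maintainer publishes a SHA-256 digest for each artifact through a channel you already trust, and that you have some way of obtaining a second, independently built copy; neither assumption comes from any particular project. The function names (`sha256_of`, `matches_published_digest`, `builds_are_identical`) are hypothetical and exist only for illustration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large artifacts need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, published_hex: str) -> bool:
    """Check a downloaded artifact against a digest published by the maintainer.

    Assumption: published_hex was obtained over a channel you trust.
    """
    actual = sha256_of(path).encode("ascii")
    expected = published_hex.strip().lower().encode("utf-8")
    # compare_digest avoids leaking where the two values first differ.
    return hmac.compare_digest(actual, expected)


def builds_are_identical(first: Path, second: Path) -> bool:
    """Compare two independently produced builds of the same source.

    If the build is repeatable, the two artifacts should be byte-identical,
    so their digests should match.
    """
    return hmac.compare_digest(
        sha256_of(first).encode("ascii"),
        sha256_of(second).encode("ascii"),
    )
```

A matching digest only tells you that the file has not changed since the digest was published. It says nothing about whether the build machine was clean or whether the source was what you expected, which is why the questions above still need answers from the maintainer.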
Once you consider these areas, you then need to work out how to measure and track each of them. Who is in a position to judge the extent to which any particular maintainer is fulfilling each of the areas? How much can you trust them? These are complex issues, and much more needs to be written about them, but I am passionate about exposing the importance of explicit trust in computing, particularly in open source. There is work going on around open source supply chain management (for instance, the new Project Rekor), but there is lots of work still to be done.
Remember, though: when you take a package—whether library or executable—please consider what you're consuming, what about it you can trust, and on what assurances that trust is founded.
This article was originally published on Alice, Eve, and Bob and is reprinted with the author's permission.