Practices and expectations that one may have developed in working with conventional software licensing may lead to frustration when confronting open source software. The modest request, "Please, just show me the license," may be met with an unsatisfying response. While the response is sometimes very simple, the license information for open source software is often more complicated and does not match the expectations set by conventional software licensing.
What's up? Is open source software licensing broken? No. Differences, not just in the type of license terms, but in how the software is developed, lead to differences in how software license information is conveyed. In part, this results from tradeoffs between lawyer convenience and developer convenience.
To say that open source software can be developed 'collaboratively' does not begin to capture the extent to which open source development activities may differ from those for conventionally-licensed software. While there are open source projects that, like conventionally-licensed software, are maintained by a single person or by a small, fixed group, collaboration on open source projects can occur among a wide range of potential contributors. For example, GitHub's annual Octoverse report for 2019 says that over 350,000 people contributed to the top 1,000 projects. But it is not just the number of contributors that sets this apart from the development of conventionally licensed software. The people contributing to an open source project may have no connection among themselves other than having discovered some shared interest in that software project. Participation may evolve over time. The original developer(s) may move on and leave others to continue the development of the project. All this may take place without planning or an overarching governance organization.
Rather than following prescriptive governance rules, open source collaborative activities can be not merely lightweight, but much more responsively ad hoc than would be expected for conventionally licensed software. Practices concerning open source license information are adapted to such collaborative development.
- The terms in open source licenses facilitate collaborative development by providing the needed permissions—copy, modify, distribute—not just for binaries, but for source, too. The Open Source Definition has proven to be a valuable aid in focusing attention on licenses that meet its requirements.
- License information for open source software is embedded in the source code. When one obtains the source code, one receives the corresponding license information. At the scale of millions of contributions each year, could separate license management be at all workable? Also, because the license information is embedded in the source code, it can reflect license-related details that would be impractical to represent in some separately managed license process. For example, embedding in the source code makes it practical to indicate which license terms apply to which portions of the software.
To illustrate what open source license practices accomplish, consider the following example software project: it began five years ago; 50 contributors have contributed so far; several features have been added by adapting portions of software from other projects; the developer of the original code moved on after three years; several commercial enterprises have come to depend on this software, either in one of their products or in-house; this software could have a future of 5-10 more years if updated to take into account changes in other software and relevant aspects of the computing world.
The course of such a project is readily accommodated by existing, commonly used approaches to representing license information in open source projects. With no advance planning, contributors can come and go from the project; portions of the project can have different license terms; and commercial enterprises can continue to share the work of maintaining the software with little governance overhead, while retaining the ability to go completely independent with their fork of the software if cooperation with others falls apart.
In contrast, how would conventional approaches to software licensing have operated to support this development? Would this collaboration even have been possible? Are we going to have a whole license infrastructure to keep track of the applicability of thousands of "master software development and distribution agreements?" Are we going to simplify licensing by having a few companies control everything?
Let's return to the question, "What is the license?" My purpose in talking about the characteristics of open source development is to illustrate that there are important non-legal considerations that contribute to how open source license information is represented. The representation of license information in open source software often does not match the expectations of conventional licensing. But, the differences are not a sign of a broken system. Rather, these are differences that support large scale collaborative development of software, an approach to building software that has proven, over the last two decades, to be remarkably powerful.
What does open source license information look like?
In general, one considers the license terms for each "software component." A software component might be visible to users as an application program, or it might be something less apparent to users, like a library that provides certain functionality when combined with larger programs.
For many software components, the license is simple: one of a dozen of the most common open source licenses applies to all of the software in the component. Beyond those most common licenses, there is a long tail of licenses with text variations that are not frequently used. But, with the guidance of the Open Source Definition, the permissions and restrictions in open source license terms stay within certain bounds.
If you are going to do software development that integrates open source software into other software, then you need to understand any copyleft terms (such as in the famous GPL family of licenses) that apply to the software being integrated.
For reasons that may be apparent from my discussion of how open source software is developed, license information can be more complicated than a single license.
- While there may be one main "project license" for a software component, there may be portions of the software licensed under other licenses. This may result in different license notices in various parts of the source code.
- Some projects have a practice of putting copyright notices in each source file. Others primarily rely on the presence of one or more files that contain license text.
- Copyright notices give an indication of who might be copyright owners of portions of the software (however, given the variability of copyright notice practices, that indication may be weak).
- The source code from which a software component is built may include software that is not reflected in the resulting component, such as tests or build-related files. This might matter to someone who is using a no-GPL rule (a project might include GPL-licensed files, but not in the files from which the executable program is built).
This fine-grained license information is most efficiently conveyed in the source code, as much of the detail concerns which portions of software certain license information relates to. At the most detailed level, the source code is the license. When the license information is in the source code, that license information can be maintained in the same way as the source code, such as in a version control system, and the information is inherently available to anyone who obtains the source code.
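As a rough illustration of license information traveling with the source code, here is a minimal Python sketch. It assumes, purely for the example, that files carry SPDX-style "SPDX-License-Identifier:" tags in comments; many projects use other notice styles that this would miss. The function names license_tags and scan_tree are hypothetical, not part of any existing tool.

```python
import re
from pathlib import Path

# Assumption: files declare their license with an SPDX-style tag, e.g.
#   // SPDX-License-Identifier: MIT
# Projects that rely on full license headers or a single license file
# would need a different approach.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([\w.+\-() ]+)")


def license_tags(text):
    """Return the license expressions declared in one file's text."""
    tags = []
    for line in text.splitlines():
        match = SPDX_TAG.search(line)
        if match:
            # strip() removes the space left before a closing "*/", if any.
            tags.append(match.group(1).strip())
    return tags


def scan_tree(root):
    """Map each file's path (relative to root) to the tags it declares."""
    root = Path(root)
    found = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            found[path.relative_to(root).as_posix()] = license_tags(text)
    return found
```

Because the result is keyed by file, it keeps exactly the detail that a single project-wide license statement would lose: which terms apply to which portions of the software. Files with an empty list are the ones that would need a closer look.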
It might seem straightforward to extract the license information from the source code and create a summary of the license terms. However, what might be a good summary for one person or company might be inadequate for another. Different people may focus on different license details. One might want to know exactly which components of the software are under copyleft terms. Someone else might not be concerned about a component-by-component summary. Someone else might want all of the license notices, including every different copyright notice.
What license information details do you want to see? Software development is rich with tools. Tools that scan, extract, and report license information exist and are an active subject of continuing development. Now, "What is the license?" might be reframed as, "Show me a report of the license information," where that report might include a range of different details depending on what matters to the person requesting the report. At the most detailed level, the source code is the license.
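To make the idea of "a report that depends on who is asking" concrete, here is a small Python sketch that produces three different views from the same per-file license data. Everything in it is an assumption made for illustration: the FileRecord structure, the sample paths and notices, the "shipped" flag for files that end up in the built component, and the short, deliberately incomplete list of copyleft identifier prefixes.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical per-file license data, as a scanner might produce it."""
    path: str
    license_id: str
    copyrights: list = field(default_factory=list)
    shipped: bool = True  # False for tests, build scripts, and the like


# Assumption: an incomplete prefix list, enough for this example only.
COPYLEFT_PREFIXES = ("GPL-", "LGPL-", "AGPL-")


def license_summary(records):
    """Count licenses across the files that end up in the built component."""
    return Counter(r.license_id for r in records if r.shipped)


def copyleft_files(records, shipped_only=True):
    """List files under copyleft terms, optionally ignoring unshipped files."""
    return [
        r.path
        for r in records
        if r.license_id.startswith(COPYLEFT_PREFIXES)
        and (r.shipped or not shipped_only)
    ]


def all_copyright_notices(records):
    """Collect every distinct copyright notice, for a full notice file."""
    return sorted({notice for r in records for notice in r.copyrights})


# Invented sample data for a small component.
records = [
    FileRecord("src/main.c", "MIT", ["Copyright 2019 Example Author"]),
    FileRecord("src/hash.c", "BSD-3-Clause", ["Copyright 2017 Another Contributor"]),
    FileRecord("tests/run.sh", "GPL-2.0-or-later",
               ["Copyright 2020 Example Author"], shipped=False),
]

print(license_summary(records))
print(copyleft_files(records))
print(copyleft_files(records, shipped_only=False))
print(all_copyright_notices(records))
```

With this sample data, someone applying a no-GPL rule to the built component sees no copyleft files, while someone reviewing the whole source tree sees the GPL-licensed test script. Neither report is wrong; they answer different questions from the same source code.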
Conventional software licensing and open source software licensing address different worlds—software being built in different ways. Be prepared. Have different expectations.