Post open source software, licensing and GitHub

Few would deny that the rise of GitHub as a popular hosting service for software projects is one of the most significant developments to affect open source during the past five years. GitHub's extraordinary success is necessary context for understanding the criticism leveled at it during the past year from some within or close to the open source world. This criticism has focused on licensing, or rather the lack of it: it is claimed that GitHub hosts an enormous amount of code with no explicit software license. Some critics have suggested that this situation results from a combination of the ignorance of younger developers about legal matters and willful inaction by GitHub's management. In a follow-up article I will discuss the measures recently taken by GitHub to address these concerns; this article explores aspects of the complaint itself.

The coinage of "POSS"

Last September James Governor of RedMonk issued his celebrated tweet about younger GitHub-using developers being "about POSS - post open source software", characterized by disdain or disregard for both licensing and governance. I do not believe that there is such a thing as "POSS" in the sense Governor apparently meant. However, for convenience, in this article I will use the term "POSS" in the way Donnie Berkholz (also of RedMonk) did in his recent analysis of Ohloh data, to mean seemingly-open-source projects for which explicit licensing appears to be absent.

An interesting and mostly negative discussion about POSS was sparked by Governor's tweet. Stephen Walli, calling GitHub a "promiscuous site", argued that without greater attention to software "hygiene" developers risked the loss of participation. Simon Phipps asserted that GitHub was not taking open source seriously and was creating an environment of heightened legal risk. Mark Radcliffe called POSS a "disturbing trend", noting that it would serve only to frustrate the presumed desire of developers for wide use of their code. Both Phipps and Luis Villa suggested that POSS was something that would only benefit lawyers.

No explicit license = "all rights reserved"?

A number of GitHub's critics have pointed out that the status of code without a license is that no copyright permission is granted at all. This is essentially correct, assuming the code is copyrightable to begin with, but it overlooks certain details I consider relevant.

One is a general historical observation. The free software movement was launched primarily by hackers in the U.S., operating, until March 1, 1989, in a legal regime that made it very easy to cause copyrightable material to enter the public domain (in a very precise sense). The predecessors of post-1998 public forge sites were Usenet newsgroups through which developers commonly shared source code during the 1980s until the mid-1990s. Casual inspection of archives of these newsgroups indicates that even after U.S. entry into Berne and the initial popularization of what we would now think of as the GPL, MIT and BSD license families, such code typically was accompanied by no copyright or license notice. The practice of noninclusion of explicit licensing on what was in a social sense free software continued well after Usenet ceased being a common medium for software sharing in favor of FTP archive sites and still later web-based project forge sites.

Indeed, in all the criticism of GitHub and POSS I have seen no replicable data analysis showing that POSS has increased, let alone begun, with the rise of GitHub. The historical view might suggest that the practice of non-explicitly-licensed code that is commonly assumed to be free software is actually an enduring tradition with roots that predate the 1976 Copyright Act in the U.S. (which established that copyright attached at fixation rather than upon publication with proper notice). This tradition has only gradually been replaced by the use of copyleft and permissive FLOSS licenses, in a manner not perfectly synchronized with the changes that occurred in U.S. copyright law. I have wondered whether the critics of GitHub may be looking at the issue backwards, perhaps unaware of how common legally-informal code sharing was in developer communities prior to 2008 (the year of GitHub's launch).

Another detail worth mentioning when considering the equation of POSS with "all rights reserved" is something I have spoken of in other contexts. The legal operation of open source development (and many other domains of net-distributed gratis digital content) cannot be properly understood without assuming the existence of a robust doctrine of implied copyright licensing. In U.S. copyright law, an implied license may arise from circumstances creating a reasonable expectation that the copyright holder intends a work to be used for some purpose. With respect to a public GitHub source code repository with no explicit licensing, there is a strong argument under U.S. law that a broad copyright license arises by implication, given the project author's understanding that git and GitHub's proprietary services, by their design, facilitate widespread code sharing and code improvement, including public sharing outside of GitHub itself.

For GitHub, though, implied copyright licensing theories can only go so far. It is difficult to see how, without special facts, the implied license for a public GitHub repo would encompass all of the permissions associated with open source norms, particularly rights to commercialize and engage in more than mere development use. It is also not helpful that GitHub has told its critics that it intends for users' repositories to be "all rights reserved" by default and appears to have viewed this as a way of protecting users with limited legal knowledge. While GitHub's terms of service require users to agree to allow others to "view and fork" their public repositories, this gives rise to a license that seems more restrictive than what ought to occur by implication.

POSS as failed attempts at ultra-permissive licensing?

It is telling that we cannot easily discuss the POSS phenomenon without assuming that developers intended their code to be open source, "all rights reserved" arguments notwithstanding. But if this is so, then the risk associated with such code specifically because of the absence of explicit licensing is necessarily low. Ignoring unrealistic parade-of-horribles hypotheticals about future copyright trolls, one might ask whether POSS developers, far from being naive or ignorant of legal matters, are engaging in an extreme version of the price discrimination that underlies so-called dual-licensing open source business models.

Intuition based in part on the characteristics of non-POSS GitHub code suggests otherwise. Among explicitly-licensed GitHub repositories, noncopyleft licensing is far more prevalent than copyleft licensing, with the minimalist MIT license likely being the most popular choice. This, plus the shift commonly assumed to be taking place towards noncopyleft licensing, particularly among the sort of younger, web-oriented developers who appear to form a substantial core of GitHub's user base, suggests that if POSS is really prevalent on GitHub, it probably signals a desire to share code on terms as permissive as possible, to the extent the developer gave thought to the matter at all.

POSS may therefore usefully be seen as part of a larger contemporary cultural phenomenon. It is one which includes the introduction of CC0 (2009) and the Unlicense (2010) and a widely-perceived decline in copyleft open source license market share. Luis Villa has proposed that POSS be interpreted as a naive effort by developers to critique the assumption in the orthodox open licensing ecosystem that sharing cannot, or should not, be done without explicit permission. Villa suggests that authors and evaluators of open licenses should be exploring legally sound ways of accommodating this pushback against what Lawrence Lessig calls the "permission culture."

GitHub's legal empowerment of users

The discussion of POSS and GitHub has ignored a positive aspect to the phenomenon. One of the consequences of GitHub's ease of use is that it has become easy to propose and make legal improvements to GitHub-hosted repositories. I have observed that many GitHub users are submitting "please add a license" issues and pull requests. In some cases these are coming from employees of corporate users and packagers from legally-scrupulous community distros like Fedora. Often the result is an apologetic comment from the repository owner, asking for feedback on which license to choose and education on how to handle the mechanics. In quite a few cases it appears that the problem was not absence of licensing but rather that the explicit license was not obvious to the user (for example, because no standalone license file was used, which is hardly a majority practice).

The picture one gets when looking at these cases is not so much that explicit licensing is not occurring, but rather that it is occurring publicly, later than project launch and at the point at which the project begins to attract significant interest. The issues and pull requests often express a preemptive preference for a particular license type (invariably noncopyleft). This is a very significant development: it may be that, for the first time in the history of free software, users are participating in the license selection process.

Released under CC0.

Comments

robinmuilwijk
Open Source Sensei

Thanks for sharing your insight/view on the licensing 'issue' at GitHub. It gives me another, more complete, view of it.

wdavis
Open Enthusiast

A good read! I liked your final point in particular on the legal empowerment of the users.

Example: https://github.com/kanthvallampati/IVORY/issues/1

I wonder if there is any data or research into willingness to contribute dependent on whether there is a license listed, a license file, and the license type.

Perhaps if a license.txt isn't detected, a dialogue should trigger letting the contributor know, and suggesting they post it as an issue for resolution (or non-resolution).

Either way I think it's a great opportunity for contributors and developers to engage with the project, steer the project if they care enough to, and help educate others about licenses and licensing.

Michael Eager

One of the purposes, as I understand it, of the Berne Convention and current US copyright law was to specify exactly the ways in which a copyright could be transferred or licensed, and to eliminate most implied copyright licenses. The default presumption when something is published, no matter whether on GitHub or in the local newspaper, is that all rights are reserved. It may be possible to rebut this in some way, but this has to be on an individual basis for each author and each separate work of authorship, not a vague "social contract" argument which applies whether the author was aware of it or not. The approach that GitHub is taking, that all submissions are "all rights reserved" unless otherwise specified, seems to be following the intent of the author protections in the Berne Convention.

wille

I think GitHub should ask the user to choose a license when creating a repository. It would avoid a lot of problems.