What the data says about how Linux kernel developers collaborate

Breaking down research findings about the values and traits of Linux kernel developers.

When I worked in the Open Source Technology Center at Intel, we had quite a few kernel developers on the team, and I was always interested in how they worked so closely with people from a wide variety of companies, including our competitors.

One of the interesting things about the Linux kernel is that the vast majority of people who contribute to it are employed by companies to do this work; however, most of the academic research on open source software assumes that participants are volunteers, contributing because of some personal need or altruistic motivation. Although this is true for some projects, the assumption just isn't valid for projects like the Linux kernel. To learn more, I interviewed 16 kernel developers about how people work together in the kernel.

Here is what I've learned to date.

Employment relationships

Many people consider themselves Linux kernel developers first and employees second. Even when they enjoy their current job and like their employer, most of them tend to look at the employment relationship as something temporary, whereas they view their identity as a kernel developer as more permanent and more important. Although companies do sometimes influence the areas where their employees contribute, individuals have quite a bit of freedom in how they do the work. Many receive little direction for their day-to-day work, and their employers trust them to do useful work. Occasionally, however, they are asked to do a specific piece of work or to take an interest in a particular area that is important to the company.

Many kernel developers also collaborate with their competitors on a regular basis, where they interact with each other as individuals without focusing on the fact that their employers compete with each other. This was something I saw a lot of when I was working at Intel, because our kernel developers worked with almost all of our major competitors.

Working with others

Each kernel developer works more closely with some people than others. Some are strictly professional relationships, but others develop into friendships. These friendships and professional relationships, along with meeting people in person at conferences, make it easier to collaborate on the mailing lists.

In most cases, people don't seem to be particularly concerned about where other Linux kernel developers work; however, some give volunteer software developers a bit more leeway and help than they would give people who are being paid by a company to do similar work.

Physical location is also not particularly important, because collaboration happens on mailing lists where people respond asynchronously, which makes it easy to collaborate across many time zones. Although some people keep track of their key collaborators' time zones to gauge when to expect replies, for the most part, time zones don't seem to really matter much.

Collaboration is on the mailing lists

The focus of my research is on collaboration and which people work together on the kernel. With the Linux kernel, discussions about patches happen on various mailing lists, so the lists are the best place to look if you want to understand how people are collaborating. Because the real work happens on the many subsystem lists, my focus isn't on the main Linux kernel mailing list (LKML), but rather on a few subsystem mailing lists.

This doesn't mean that I'm ignoring the source code. I also look at a person's recent code commits or maintainer status as something that can influence how people work together.

Take the Linux USB mailing list, for example:

[Figure: Graph showing collaboration channels on the Linux USB mailing list. Image credit: opensource.com]

This image represents replies on the Linux USB mailing list over a two-year period (Oct. 31, 2013 to Oct. 31, 2015). There were more than 8,000 replies to messages, or about 10 per day, not counting messages that weren't replied to. The larger, darker circles represent companies with employees who reply or are replied to the most on the USB mailing list, with darker arrows indicating more email exchanges between two companies. I've added names only for the companies whose employees have the most activity on the mailing list. The little loops show that employees at a company reply to each other (or possibly to themselves). For example, companies such as Texas Instruments and Intel have several people working on USB code who reply to each other.
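To make the structure of a graph like this concrete, here is a minimal Python sketch of how reply pairs could be rolled up into company-to-company edges. It is not the tooling used in this research; the AFFILIATION table, the email addresses, the sample replies, and the function names are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical email-to-employer lookup. How affiliations were actually
# resolved in the research is not described in the article.
AFFILIATION = {
    "alice@example.com": "Intel",
    "bob@example.com": "Intel",
    "carol@example.org": "Texas Instruments",
    "dave@example.net": "Red Hat",
}

# Made-up (replier, original_author) pairs standing in for mailing list data.
SAMPLE_REPLIES = [
    ("alice@example.com", "bob@example.com"),
    ("bob@example.com", "alice@example.com"),
    ("carol@example.org", "alice@example.com"),
    ("alice@example.com", "carol@example.org"),
    ("dave@example.net", "carol@example.org"),
    ("eve@unknown.example", "alice@example.com"),  # unknown employer, skipped
]


def company_reply_edges(replies, affiliation):
    """Count directed replies between companies (the arrows in the graph).

    A reply between two people at the same company becomes a self-loop,
    i.e. an edge whose source and destination are the same company.
    """
    edges = Counter()
    for replier, author in replies:
        src = affiliation.get(replier)
        dst = affiliation.get(author)
        if src is None or dst is None:
            continue  # skip people whose employer is unknown
        edges[(src, dst)] += 1
    return edges


def company_activity(edges):
    """Replies sent plus replies received per company (the circle sizes).

    A self-loop counts once as sent and once as received.
    """
    activity = Counter()
    for (src, dst), count in edges.items():
        activity[src] += count
        activity[dst] += count
    return activity


edges = company_reply_edges(SAMPLE_REPLIES, AFFILIATION)
activity = company_activity(edges)

for (src, dst), count in sorted(edges.items()):
    print(f"{src} -> {dst}: {count}")
```

In this sketch, edge counts would drive how dark an arrow is drawn and the activity totals would drive circle size; a graph library could then handle the layout.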

You also see a lot of activity between competitors, including the big semiconductor companies, such as Texas Instruments, Intel, and others. You also see lines between employees at the Linux distro companies. There is a strong line between Red Hat and Novell (SUSE) that's a bit hard to see, and there is also a tiny line that you can't really see going from Novell to Canonical.

For the Linux USB list, I've also run statistical models to understand what makes people more likely to reply to others. Here are a few highlights:
  • Someone is more likely to reply to a maintainer.
  • People who have recently contributed code are more likely to be replied to or to reply to others.
  • People who recently contributed to the same areas of the code are also more likely to reply to each other.
  • Employees working for the same company tend to reply to each other.
  • There was no evidence that people in similar time zones were more likely to collaborate.

What's next?

I'm getting close to finishing my PhD, but I still have six to eight months of work ahead, during which I will continue looking at a few other mailing lists using various types of statistical models to finish my research. All of this will eventually be published in my dissertation (for anyone who wants to read a couple hundred pages of my academic ramblings about the kernel). After I finish this research, I plan to go back to work at a technology company, doing something similar to my previous work in a senior open source community role where I can make everyone call me doctor.

Learn more in Dawn Foster's talk, Collaboration in Kernel Mailing Lists, at Open Source Summit EU, which will be held October 23-26 in Prague.

Consultant at The Scale Factory, Geek, Community Manager, Runner, Reader of Science Fiction, World Traveler, Technology Enthusiast and PhD student at the University of Greenwich in London.

1 Comment

Really interesting, thanks. It would be really interesting to find out about the tools - graphical and other - that you're using in your research.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.