How to analyze corporate contributions to open source projects


Opensource.com

In proprietary software, the company contributes 100% of the code. A traditional proprietary software product has a development community of one: the software company itself. The company’s ability to support that product, to influence the features that come in future versions, and to integrate that product with other products in its ecosystem flows from its direct control over the source code and its development.

In open source, it is rare that any one company controls anything close to 100% of the source code; in fact, it is often a sign of a weak open source community if one company dominates a project. The power and the value of the open source development model come from many individual and corporate contributors coming together. Using this thinking, we can look at the collaborative corporate contributions to OpenStack.

Corporate OpenStack contributions: Four key questions

One very basic way to look at the corporate contributions to OpenStack is to analyze the aggregate contributions to all of the core projects that make up OpenStack:


OpenStack contribution by company
Source: Stackalytics.com

But, as some have pointed out, this can quickly become an exercise in “vanity statistics.” What is the real value to enterprise customers of contributions to the community? Is it best to judge an organization’s participation in a project by raw commits, or is there another measure that better represents involvement? In a multi-part project like OpenStack, the breadth of projects contributed to might also be a telling statistic.

Looking beyond the aggregate ranking to a heat map of participation by project gives a more nuanced way of considering four questions:

  1. Which core projects are particular companies focusing on?
  2. Which companies are participating broadly across the projects?
  3. What are the gaps in OpenStack knowledge and participation of a particular company?
  4. Does a company’s investment in the OpenStack community match the products or services it is selling?
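
Questions 2 and 3 can be made concrete with a small calculation. The following is a minimal sketch, not the method used for this article: the company names, the commit counts, and the `breadth_and_gaps` helper are all hypothetical, and the threshold of one commit for counting a company as "active" in a project is an assumption you may want to raise.

```python
# Hypothetical commit counts keyed by (company, project).
# These numbers are invented for illustration; they are not Stackalytics data.
COMMITS = {
    ("Company A", "Nova"): 120,
    ("Company A", "Neutron"): 40,
    ("Company B", "Swift"): 90,
    ("Company B", "Nova"): 5,
}

# A shortened, illustrative project list (the article's core list has ten entries).
PROJECTS = ["Nova", "Neutron", "Swift"]


def breadth_and_gaps(commits, projects, min_commits=1):
    """Return, per company, how many projects it is active in and which it skips.

    A company counts as active in a project when it has at least
    `min_commits` commits there (an assumed, adjustable threshold).
    """
    companies = sorted({company for company, _ in commits})
    report = {}
    for company in companies:
        active = [
            project
            for project in projects
            if commits.get((company, project), 0) >= min_commits
        ]
        gaps = [project for project in projects if project not in active]
        report[company] = {"breadth": len(active), "gaps": gaps}
    return report


if __name__ == "__main__":
    for company, row in breadth_and_gaps(COMMITS, PROJECTS).items():
        print(company, row["breadth"], row["gaps"])
```

Raising `min_commits` filters out token contributions, which changes the picture for companies such as the hypothetical Company B, whose five Nova commits would no longer count as participation.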

Let’s consider contributions to all of the projects that were considered “core” in OpenStack Havana:

  • Ceilometer (OpenStack Telemetry)
  • Cinder (OpenStack Block Storage)
  • Glance (OpenStack Image Service)
  • Heat (OpenStack Orchestration)
  • Horizon (OpenStack Dashboard)
  • Keystone (OpenStack Identity)
  • Nova (OpenStack Compute)
  • Neutron (OpenStack Networking)
  • Oslo (OpenStack Common Libraries)
  • Swift (OpenStack Object Storage)

What’s a better way to visualize a company’s participation in OpenStack beyond the aggregate rankings? If we take each company’s contribution to the latest OpenStack release, Havana (in this case, by number of commits), express it as a percentage of the total contribution, and then look at it across these projects, the plot for the top ten contributors looks like this:
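
The normalization described above can be sketched in a few lines. This is one reading of it, assuming that "total contribution" means the total commits to each individual project; the data and the `share_by_project` helper are hypothetical and are not taken from Stackalytics.

```python
# Hypothetical commit counts keyed by (company, project); invented for illustration.
COMMIT_COUNTS = {
    ("Company A", "Nova"): 150,
    ("Company B", "Nova"): 50,
    ("Company A", "Neutron"): 20,
    ("Company B", "Neutron"): 60,
    ("Company C", "Neutron"): 20,
}


def share_by_project(commits):
    """Express each company's commits as a percentage of that project's total.

    Assumption: the denominator is the per-project commit total, so the
    shares within one project add up to 100.
    """
    totals = {}
    for (_, project), count in commits.items():
        totals[project] = totals.get(project, 0) + count

    return {
        (company, project): 100.0 * count / totals[project]
        for (company, project), count in commits.items()
        if totals[project] > 0  # skip projects with no commits at all
    }


if __name__ == "__main__":
    for (company, project), pct in sorted(share_by_project(COMMIT_COUNTS).items()):
        print(f"{company:10s} {project:8s} {pct:5.1f}%")
```

Each (company, project) percentage is one cell of the heat map. Dividing by the company's own total instead would show where each company concentrates its effort rather than how much of each project it accounts for.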

OpenStack contribution by project

Source: Stackalytics.com

Participation matters in open source

Perhaps it does not matter if you are using the free OpenStack code on a free Linux distribution. But if you are paying for an OpenStack product, or you are looking to move from a proof-of-concept to a production OpenStack environment, then I believe that community participation really does matter.

It’s not just about who is the top contributor. Does any OpenStack vendor really have the expertise to support your production OpenStack environment? Can an OpenStack vendor be a strategic partner for the long term in driving your requirements into future versions of their OpenStack product? These are questions similar to those that enterprise customers were asking ten years ago when they moved from Linux proofs of concept to running real workloads on Linux systems. And they are questions worth considering again as OpenStack begins to appear in the datacenter.

Originally posted on Red Hat Stack: An OpenStack Blog. Reposted with permission.

Chuck Dubuque is the Director of Product Marketing for Red Hat’s virtualization and OpenStack technologies, including Red Hat Enterprise Virtualization, Red Hat Enterprise Linux OpenStack Platform, and Red Hat Cloud Infrastructure. Prior to joining Red Hat, he spent three years in technology sales and consulting, and eight years in biotechnology marketing and business development.

8 Comments

Can you tell a little bit more about the methods and data used in this article? Did you look at source code commits? What data set did you use for assigning a committer to a company? Is any of that data public? Would you like it to be? (I run FLOSSmole and I'd be happy to take your data set as a donation if you feel like giving it to The Greater Good.) PS, I'm commenting here because comments are closed on the original Red Hat blog posting.

Ok, never mind - I see now that there is a "source" link on the original posting that didn't make it over to this opensource.com site. For those following along at home, likely just me (!), the original link was here: http://www.stackalytics.com/?release=havana&metric=commits&project_type=core

Hi Megan,

We've got a source link back to Stackalytics under each image - did we miss an additional source link? Thanks.

AWESOME - I see it now.

We're maintaining another dashboard for OpenStack (http://activity.openstack.org/dash/) with information about company contribution, including commits (http://activity.openstack.org/dash/browser/scm-companies.html), closed tickets (http://activity.openstack.org/dash/browser/its-companies.html) and participation in mailing lists (http://activity.openstack.org/dash/browser/mls-companies.html). A summary blog post for Havana (http://blog.bitergia.com/2013/10/17/the-openstack-havana-release/) is also available.

To do further analysis, the complete databases used (retrieved with MetricsGrimoire tools, http://metricsgrimoire.github.io) can be downloaded as MySQL dumps (http://activity.openstack.org/dash/browser/data/db/), ready to use and updated daily.

This is fantastic, thank you for sharing!

Great! When I did the initial search for data right before the December holidays, I tried using the Bitergia dataset, but the download links to the sources I needed were not working at the time, so this is good to see. I am planning to refresh this analysis for Icehouse, so I will definitely look at this dataset again. I will reach out to you for help if I have any questions.

I'm sorry that the links didn't work at that point. If you have any problems in the future, please let us know, and we will solve them asap.

Copyright © 2013 Red Hat, Inc.