The open source technology behind Twitter

Without open source, Twitter wouldn't exist. Every Tweet you send and receive touches open source software on its journey between computers and mobile devices. We were curious about how much open source is used at Twitter. Beyond that, we wanted to discover how open source may influence the culture at Twitter, Inc.

We asked Chris Aniszczyk, Open Source Manager at Twitter, to share the company's open source story. Aniszczyk will be keynoting at this month's LinuxCon, August 29 through 31, in San Diego, CA. His topic: The open source technology behind a Tweet.

See what Aniszczyk (@cra on Twitter) had to say about open source and the open culture at Twitter.

Give us a sneak preview of your upcoming LinuxCon keynote, "The Open Source Technology Behind a Tweet."

On the surface, Twitter is a simple real-time service where the unit of currency is 140-character messages called Tweets. A closer look reveals the complexity of running the service: over 400 million Tweets are sent each day. At this scale, you have to deal with some interesting real-time engineering problems. In the keynote, I will provide insight into how we address these challenges and why we favor open source software to do so. The talk will revolve around tracing the life of a Tweet from our backend to the frontend. In the end, I expect the audience to leave with a better appreciation of open source technology and what happens behind the scenes when a humble Tweet appears in their timelines.

How much open source software is used behind the scenes of a Tweet?

We use a lot of open source software. In my opinion, it’s a no-brainer, because open source software allows us to customize and tweak code to meet our fast-paced engineering needs as our service and community grow. When we plan new engineering projects at Twitter, we always measure our requirements against the capabilities of open source offerings, and prefer to consume open source software whenever it makes sense. As a result, much of Twitter is now built on open source software, and the open source way is now a pervasive part of our culture. On top of that, there is a positive cycle of teaching and learning within open source communities that we benefit from. We share the majority of our code on GitHub too.

Here are a few concrete examples of open source software we consume:

  • MySQL is heavily used for primary storage of Tweets; we develop our MySQL fork in the open to collaborate with the upstream community.
  • Cassandra, Hadoop, Lucene, Pig and a variety of Apache projects are used within our infrastructure to power services such as analytics and search. We also contribute back to these projects and have sponsored the Apache Software Foundation.
  • Memcached is used heavily in our caching infrastructure to scale our ever-growing traffic; we recently open sourced Twemcache which was heavily inspired by the Memcached code base.

On top of that, we produce a variety of open source software too:

  • Iago is a load generator that we created to help us test services before they encounter production traffic. Iago provides us with capabilities that are uniquely suited for Twitter’s environment and the precise degree to which we need to test our services.
  • Zipkin is a distributed tracing system that we created to help us gather timing data for all the disparate services involved in managing a request to the Twitter API.
  • Scalding is a Scala library that makes it easy to write MapReduce jobs in Hadoop by taking advantage of built-in integration with Scala and the JVM.

I would also like to point out Apache Mesos, which makes it easier to build distributed applications and share data center resources. We use it within Twitter for everything from executing analytics jobs on top of Hadoop, to running Rails applications. It’s really one of the cornerstone technologies at Twitter that underpins everything. You should check out this presentation for more information.

What's it like to work at Twitter? Is the culture influenced by open source?

If you have spent any time in the open source community, you are aware that the open exchange of information can have a positive impact on the world. At Twitter we keep this principle in mind every day; each employee has a voice and a chance to innovate. We have raucous weekly all-hands meetings where tough questions are asked and answered because we maintain a culture of openness and trust from the inside out. Furthermore, we established an open source office about a year ago to support a variety of open source organizations that are important to us. We’re grateful to the open source community for its contributions, and want to maintain our healthy, reciprocal relationship.

In terms of engineering culture, Twitter is a real-time and event-driven problem, and we shape our engineering culture to be real-time and event-driven. We want a nimble, reactive, and fast-paced engineering culture that can scale as we grow as a company. More than 400 million Tweets are sent a day, and that’s a lot of Tweets to deliver. We also hold quarterly hackweeks where employees get a week to work on projects that they are truly passionate about but that are not necessarily related to their day-to-day responsibilities. At times, hackweek results in crazy videos.

What's the most interesting way you've seen someone use Twitter?

There are so many that we have a site dedicated to featuring some of the most interesting uses, called Twitter Stories. One of my favorite stories features Chris Strouth, who Tweeted "shit, I need a kidney" and then got one. In terms of something more timely, there was some serious flooding in Manila and Filipinos turned to Twitter as a lifeline. I’m grateful to work at a company that provides a service that has such a positive impact on the world.

How do you use the open source way in your everyday life?

If you aren’t aware, I helped create Twitter’s Open Source Office (@TwitterOSS) last year and my day job involves running it. It’s a lot of work but I consider myself lucky that I get to help shape Twitter’s engineering culture by teaching them about the open source way. In my spare time, anything I hack on I like to share on GitHub, which is just an amazing place and has done a lot for the open source community in my opinion.

In a previous life, I was heavily involved with the Eclipse Foundation and led a plug-in development project. I still sit on the Board of Directors at Eclipse, representing over 1000 committers, and occasionally commit to the EGit and JGit projects, which provide Git support for the Eclipse community.

In the end, it’s amazing to see how far open source has come in the last decade and I’m glad to be part of it. There’s still a lot of work to do, and if you’re interested in helping Twitter’s open source mission, we are always hiring.

Jason Hibbets is a Community Director at Red Hat with the Digital Communities team. He works with the Enable Architect, Enable Sysadmin, Enterprisers Project, and community publications.


Here are a few posts that continue the conversation in other languages:
  • So viel Open-Source-Software setzt Twitter ein
  • Die Open-Source-Technik hinter Twitter

Great article! Worth noting that Twitter's developer platform is built with Drupal.

Yes! I thought I heard that somewhere. That might have potential for a future post: Drupal powers Twitter's development site!

Indeed, we use Drupal for our developer site and some internal tools. We are actually looking to expand our usage of Drupal and are hiring. So if you like to hack on Drupal and open source, and love Twitter, consider joining the flock!

Wonderful insight. Food for thought. How can any platform use the experience learned from those who have implemented open source solutions on a global production scale?

Brian Willingham

What Linux (Red Hat or others) does Twitter use? Any idea?

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.