Vitess: A distributed, cloud-based storage solution

Image: Server room. Credit: Cory Doctorow, modified, CC BY-SA 2.0.

This year at the Percona Live Data Performance Conference, I'll be discussing Vitess. Vitess is an open source storage platform for scaling MySQL databases, optimized for use both in the cloud and on dedicated hardware. Created by YouTube in 2011, it is a distributed storage solution that keeps some of the best properties of a relational database.

My talk is called Vitess: The Complete Story. The first part of the talk will cover the core features of Vitess. The second part will dive into the details of the API, and how we conceptualized a database engine that uses other database engines for its storage.

My involvement with databases goes back to Informix in the '90s. This was during the 4GL and client-server days. I was part of the development team for a product called NewEra.

I later joined PayPal, where we used Oracle and eventually scaled it to the biggest machine money could buy. These experiences at PayPal influenced the founders of YouTube to try a different approach: scaling with commodity hardware. When I joined YouTube, the only MySQL database we had was just beginning to run out of steam, and we boldly executed the first resharding in our lives. It took an entire night of master downtime, but we survived. These experiences eventually led to the birth of Vitess.

YouTube was growing, not only organically, but also internally. There were more engineers writing code that could potentially impair the database, and our tolerance for downtime was also decreasing. It was obvious that this combination was not sustainable. My colleague (Mike Solomon) and I agreed that we had to come up with something that would leap ahead of the curve instead of just fighting fires. When we finally built the initial feature list, it was obvious that we were addressing problems that are common to all growing organizations.

This led us to develop the project as open source, which had a serendipitous payback: every feature that YouTube needed had to be implemented in a generic fashion. App-specific shortcuts were generally not allowed. We still develop every feature in open source first, and then import it to make it work for YouTube.

Aside from our architectural and design philosophy, our collaboration with Kubernetes over the last two years means anyone can now run Vitess the way YouTube does: in a dynamically scaled container cluster. We’ve had engineers dedicated to deployment and manageability on a public cloud, making the platform ready for general consumption.

Want to find out more about Vitess, YouTube, or me? Subscribe to the Vitess blog, check out the Vitess main page, and view the source code on GitHub.

To hear my talk on Vitess: The Complete Story, register for Percona Live Data Performance Conference 2016. Use the code "FeaturedTalk" and receive $100 off the current registration price!

Sugu works on scalability projects on the YouTube infrastructure and storage team. He's currently focused on developing Vitess. Prior to YouTube, he was part of the architecture team at PayPal, where he built many of PayPal's core features. Sugu has also done work in development environments, compilers, and computer graphics.


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.