OpenStack Live attendees will have several opportunities to hear Amrith Kumar speak. Kumar, the founder and CTO of Tesora, will give three talks: Replication and Clustering with OpenStack Trove; Deploying, Configuring, and Operating OpenStack Trove; and An introduction to Database as a Service with an emphasis on OpenStack using Trove.
In this interview, he provides an update on the Trove project and explains important management considerations for databases in the cloud.
Why is it important to have an open source DBaaS implementation for OpenStack?
Databases are complex, and as a result, proper configuration and maintenance are essential to proper operation. As more enterprises migrate to OpenStack, the demand for databases in OpenStack is increasing. In many cases, the success or failure of a cloud migration project will depend upon how well databases operate in the cloud.
DBaaS provides significant benefits to OpenStack operators and their customers, while dramatically improving the utility that the OpenStack deployment can provide. For these reasons, a DBaaS solution is an essential part of OpenStack.
Tell us a little bit about Trove. How is the project progressing? What's new?
From its humble beginnings in 2012, the OpenStack Trove project now boasts an active contributor community of more than 160 individuals representing 33 companies. In the Juno release alone, contributions were received from 79 individuals representing 20 companies. The project now has in excess of 300,000 lines of code and supports many relational and non-relational databases.
The Juno release—only the second release of Trove—featured replication for MySQL and clustering for MongoDB. In Kilo, we are extending the replication functionality and adding support for new databases, including CouchDB, Vertica, and IBM DB2.
Without giving too much away, give us a preview of your talks at OpenStack Live.
We will go into details of the new replication support and demonstrate the capability. Also, we'll preview some new capabilities and the roadmap for upcoming releases.
Why are replication and clustering important?
Replication and clustering are essential to the proper operation of databases and to maintaining data integrity and availability. They ensure that data is protected and accessible even in the face of failures (network, storage, etc.).
What kinds of applications take advantage of these technologies?
Scalable and distributed applications that are operating in production and manipulating "real" data require guarantees of data integrity and availability. These applications can now be assured that data written to the Trove database will be resistant to failures.
Do the applications have to be aware of the underlying replication system?
That's the beauty of a DBaaS offering. The application need not know of the existence of replication; it is up to the operator to ensure that replication is configured and running properly, and that when there are failures, they are properly handled.
What are some important management considerations for databases in the cloud?
Databases are notoriously difficult to configure and manage, and therefore repeatability is extremely important. That one forgotten step, that one innocuous error in configuration, has a nasty habit of turning into a crisis at the very worst moment.
Also, there are a number of database technologies that an enterprise may wish to use. It is more and more difficult for an IT organization to have deep expertise in all of them. By embodying the best practices for each database into the Trove framework, DBaaS makes databases easier, safer, and more cost-effective to consume in the cloud.
This article is part of the Speaker Interview Series for OpenStack Live. OpenStack Live is a conference designed to teach attendees about the best practices and performance considerations for operating OpenStack. It takes place in Santa Clara, California, on April 13 and 14, 2015.