Interview with Slava Akhmechet of RethinkDB

An open source database for realtime applications

The world of databases is no stranger to open source. In fact, many of the world's top companies, projects, and websites run various open source databases behind the scenes.

Because the choice of database has enormous implications for scalability, performance, and how the data itself can be queried, there are numerous options to meet all sorts of potential needs. RethinkDB is an open source database with a specific purpose: serving data to realtime applications, whether those applications are video game backends, financial tools, or analytics suites.

To learn more about RethinkDB, we caught up with Slava Akhmechet. Akhmechet is the founder of RethinkDB, the company with the same name as the open source project. Before founding RethinkDB, he was a systems engineer in the financial industry, working on scaling custom database systems. He is currently a PhD student on leave from a program in Computational Neuroscience at Stony Brook University.

Q&A

Tell us a little bit about RethinkDB. What is it? How is it different from other open source database systems?

RethinkDB is the first open source, scalable database designed from the ground up for the realtime web.

Traditional databases use a query-response database access model. That works well on the web because it maps directly to HTTP's request-response. However, modern marketplaces, streaming analytics apps, multiplayer games, and collaborative web and mobile apps require sending data directly to the client in realtime. For example, when a user changes the position of a button in a collaborative design app, the server has to notify other users that are simultaneously working on the same project. Web browsers support these use cases via WebSockets and long-lived HTTP connections, but adapting database systems to realtime needs still presents a huge engineering challenge.

RethinkDB is the first database that uses an exciting new database access model: instead of polling the database for changes, the developer can tell RethinkDB to continuously push updated query results to applications in realtime. This makes building modern, realtime apps dramatically easier. Developers can get a scalable realtime web application up and running in a fraction of the time with fewer engineering resources.
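
To make the contrast with polling concrete, here is a minimal Python sketch of the push model described above. It is a toy in-memory stand-in, not RethinkDB's API: `ToyTable`, `subscribe`, `upsert`, and the `old_val`/`new_val` field names are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

# A change notification: the row before and after a write.
Change = Dict[str, Optional[dict]]


class ToyTable:
    """Hypothetical in-memory table that pushes every write to subscribers."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}
        self._subscribers: List[Callable[[Change], None]] = []

    def subscribe(self, callback: Callable[[Change], None]) -> None:
        # Register a callback instead of asking the table repeatedly.
        self._subscribers.append(callback)

    def upsert(self, key: str, row: dict) -> None:
        old = self._rows.get(key)
        self._rows[key] = row
        change: Change = {"old_val": old, "new_val": row}
        # Push the change to every subscriber as soon as it happens.
        for callback in self._subscribers:
            callback(change)


def demo() -> List[Change]:
    table = ToyTable()
    received: List[Change] = []
    table.subscribe(received.append)

    # A user moves a button in a collaborative design app, twice.
    table.upsert("button-1", {"x": 10, "y": 20})
    table.upsert("button-1", {"x": 15, "y": 20})
    return received


if __name__ == "__main__":
    for change in demo():
        print(change)
```

The subscriber never queries the table. Each write arrives as a notification, which is the part a polling client would otherwise have to reconstruct by re-running its query and diffing the results.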

Why was it important to you to pursue an open source license with this project?

We think the world is moving towards more realtime applications and realtime experiences, so the first database product to get this right will be a very important part of most technology stacks for many years to come. We think it's very important for core technology like this to be accessible to everyone so nobody is left behind—students, hobbyists, startups, and companies in developing countries that can't necessarily pay large sums for products and services. Open source is the best vehicle to accomplish this. Everyone can get access to the technology, and RethinkDB can thrive by selling value-added services to larger organizations that aren't as price sensitive.

What are some of the criteria that a user/developer might use to decide what type of database technology is best for the project or application they are trying to create?

This is quite complex—these days there are many options out there, and many different use cases that require different tradeoffs in the database product. When I chose databases for my work before RethinkDB I looked at three categories—use cases, scalability, and maturity.

Many use cases still require ACID transactions (for example, financial applications). This technology is only available in traditional RDBMSes, so for those I'd choose MySQL or Postgres. Other use cases are much more analytics driven, so I'd choose a columnar database like Vertica. Yet more use cases (for modern web apps, mobile apps, and games) require flexible data models, so I'd choose NoSQL systems (like MongoDB and Redis).

For projects that required very high scale I'd choose HBase, Cassandra, and (for the lower end of the scale) MongoDB.

Finally, for regulated industries (e.g. HIPAA compliance, financial auditing), Oracle still reigns supreme because it has the most mature regulatory features.

Over the past two years we noticed the emergence of the realtime use case, and that's where RethinkDB fits in. For developers building realtime applications, we want RethinkDB to be the best product available on the market.

What's on the roadmap? What features is the development team hoping to add in the future?

RethinkDB has been in development for over five years, so it's quite mature. We're shipping RethinkDB 2.0 in the next few weeks—it will be a stability release ready for production use, along with commercial services offerings to help our customers get the most out of the product.

After RethinkDB 2.0 there is still a lot of exciting work left to do. The upcoming releases (post 2.0) will support much more sophisticated realtime push functionality (for example, we're working on restartable feeds), better high availability and automated failover support (via a new Raft implementation we're testing), and more deployment options (e.g. Windows support).

Who's committing to RethinkDB? Where is support coming from, and what interesting use cases are you seeing?

RethinkDB is venture funded, so we have a team of fifteen people working on the product full time. However, we think of ourselves as contributors to the project who happen to be paid. In addition to the core team, RethinkDB has over a hundred contributors from all around the world. The contributions come from hobbyists, students, and many of our customers. People contribute to documentation, ecosystem integration projects that make RethinkDB work seamlessly with other pieces of software, client drivers for many different programming languages, and even improvements to the core database internals.

Most of the use cases are centered around modern marketplaces, streaming analytics apps, multiplayer games, and collaborative web and mobile apps. Essentially, any time anyone is building anything for the web and wants realtime functionality, RethinkDB is a really great database choice.

How can people learn more about RethinkDB and get involved in the project?

The best way is to go to the project website. There is plenty of information there including tutorials, documentation, technical videos, and example apps. We're also very active on Twitter and IRC (#rethinkdb on freenode), so if anyone has a question they can get an answer within minutes.