A beginner's guide for contributing to Apache Cassandra | Opensource.com

Start participating in an open source database project used to power internet services worldwide.

Apache Cassandra is an open source NoSQL database trusted by thousands of companies around the globe for its scalability and high availability, which it delivers without compromising performance. Contributing to such a widely used distributed system may seem daunting, so this article aims to provide you with an easy entry point.

There are good reasons to contribute to Cassandra, such as:

  • Gaining recognition with the Apache Software Foundation (ASF) as a contributor
  • Contributing to an open source project used by millions of people worldwide that powers internet services for companies such as American Express, Bloomberg, Netflix, and Yelp
  • Being part of a community adding new features and building on the release of Cassandra 4.0, our most stable release in the project's history

How to get started

Apache Cassandra is a big project, which means you will find something within your skillset to contribute to. Every contribution, regardless of how small, counts and is greatly appreciated. An excellent place to start is the Getting Started guide.

The Apache Cassandra project also participates in Google Summer of Code. For an idea of what's involved, please read this blog post by PMC member Paulo Motta.

Choose what to work on

Submitted patches can include bug fixes, changes to the Java codebase, improvements for tooling (Java or Python), documentation, testing, or any other changes to the codebase. Although the process of contributing code is always the same, the amount of work and time it takes to get a patch accepted depends on the kind of issue you're addressing.

Reviewing other people's patches is always appreciated. To learn more, read the Review Checklist. If you are a Cassandra user and can help by responding to some of the questions on the user list, that makes an excellent contribution.

The simplest way to find a ticket to work on is to search Cassandra's Jira for issues marked as Low-Hanging Fruit. We use this label to flag issues that are good starter tasks for beginners. If you don't have a login to ASF's Jira, you'll need to sign up.

A few easy ways to start getting involved include:

  • Testing: By learning about Cassandra, you can add or improve tests, such as CASSANDRA-16191. You can learn more about the Cassandra test framework on our Testing page. Additional testing and Jira-reported bugs or suggestions for improvements are always welcome.
  • Documentation: This isn't always low-hanging fruit, but it's very important. Here's a sample ticket: CASSANDRA-16122. You can find more information on contributing to the Cassandra documentation on our Working on documentation page.
  • Investigate or fix reported bugs: Here's an example: CASSANDRA-16151.
  • Answer questions: Subscribe to the user mailing list, look out for questions you know the answer to, and help others by replying. See the Community page for details on how to subscribe to the mailing list.

These are just four ways to start helping the project. If you want to learn more about distributed systems and contribute in other ways, check the documentation.

What you need to contribute code

To make code contributions, you will need:

  • Java SDK
  • Apache Ant
  • Git
  • Python
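
Before going further, it can help to confirm that these tools are on your PATH. The following is a minimal sketch of one way to check. The function name `check_tools` is made up for this example, and the command names (`java`, `ant`, `git`, `python3`) are assumptions about how the tools are installed on your machine.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
check_tools() {
    missing=""
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing:$missing"
        return 1
    fi
    echo "All tools found"
}

# Assumed command names; adjust them (for example python vs. python3) to your setup.
check_tools java ant git python3 || echo "Install the missing tools before building."
```

The check only confirms that the commands exist. It does not verify versions, so consult the project's build documentation for the versions each Cassandra branch expects.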

Get the code and test

Get the code with Git, work on the topic, use your preferred IDE, and follow the Cassandra coding style. You can learn more on our Building and IDE integration page.

$ git clone https://git-wip-us.apache.org/repos/asf/cassandra.git cassandra-trunk

Many contributors name their branches based on ticket number and Cassandra version. For example:

$ git checkout -b CASSANDRA-XXXX-V.V
$ ant
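
To make the naming convention concrete, here is a small sketch that builds a branch name from a ticket number and a version. The helper `branch_name` is hypothetical, and the ticket/version pair in the example (16191 and 4.0) is chosen only for illustration.

```shell
# Hypothetical helper: build a branch name from a Jira ticket number and a version.
branch_name() {
    ticket="$1"    # numeric part of the Jira key, for example 16191
    version="$2"   # Cassandra version the patch targets, for example 4.0
    printf 'CASSANDRA-%s-%s\n' "$ticket" "$version"
}

# Prints the name only; pass it to "git checkout -b" inside your clone.
branch_name 16191 4.0   # prints CASSANDRA-16191-4.0
```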

Test the environment:

$ ant test

Testing a distributed database

When you are done, please make sure all tests (including your own) pass using Ant, as described in Testing. If you suspect a test failure is unrelated to your change, it may be useful to check the test's status by searching the issue tracker or looking at CI results for the relevant upstream version.

The full test suites take many hours to complete, so it is common to run relevant tests locally before uploading a patch. Once a patch has been uploaded, the reviewer or committer can help set up CI jobs to run the complete test suites.
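
As a sketch of running one relevant test locally, the wrapper below prints (or optionally runs) an Ant command for a single unit test class. It assumes the build accepts the `-Dtest.name` property, as described in the project's testing documentation. The function name `run_one_test`, the `DRY_RUN` switch, and the class name `MyChangedFeatureTest` are placeholders invented for this example.

```shell
# Hypothetical wrapper: print (or run) the Ant command for one unit test class.
# Assumes the build accepts -Dtest.name; check the Testing page for your branch.
run_one_test() {
    class="$1"   # simple class name of the test you want to run
    cmd="ant test -Dtest.name=$class"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"   # default: only show what would be executed
    else
        $cmd          # run this from the root of your Cassandra checkout
    fi
}

# MyChangedFeatureTest is a placeholder, not a real Cassandra test class.
run_one_test MyChangedFeatureTest
```

Set `DRY_RUN=0` to run the command from the root of your checkout instead of printing it.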

Submitting your patch

Before submitting a patch, please verify that you follow Cassandra's Code Style conventions. The easiest way to submit your patch is to fork the Cassandra repository on GitHub and push your branch:

$ git push --set-upstream origin CASSANDRA-XXXX-V.V 

Submit your patch by publishing the link to your newly created branch in your Jira ticket. Use the Submit Patch button.
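
If you cloned from the Apache repository as shown earlier, `origin` points there rather than at your fork. One possible approach, sketched below, is to add the fork as a second remote and push to it. The remote name `fork` and the helper `push_to_fork` are assumptions for this example, and the fork URL is something you supply yourself.

```shell
# Hypothetical helper: register a fork as a second remote and push a branch to it.
push_to_fork() {
    fork_url="$1"   # URL of your own fork (assumes you already created one)
    branch="$2"     # for example CASSANDRA-XXXX-V.V
    # Add the remote only if it is not configured yet.
    git remote get-url fork >/dev/null 2>&1 || git remote add fork "$fork_url"
    git push --set-upstream fork "$branch"
}

# Usage, with placeholders to replace by your own values:
#   push_to_fork https://github.com/<your-username>/cassandra.git CASSANDRA-XXXX-V.V
```

Keeping `origin` pointed at the upstream repository makes it easy to pull in new changes while your work lives on the fork.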

To learn more, read the complete docs on Contributing to Cassandra. If you still have questions, get in touch with the developer community.


The author wants to thank the Apache Cassandra community for their tireless contributions to the project, dedication to the project users, and continuous efforts in improving the process of onboarding new contributors.

The contributions and dedication of many individuals to the Apache Cassandra project and community have enabled us to reach 4.0, a significant milestone. As we look to the future and seek to encourage new contributors, we want to recognize everyone's efforts since the project's inception over 12 years ago. It would not have been possible without your help. Thank you!

About the author

Ekaterina Dimitrova - Ekaterina Dimitrova is a Software Engineer at DataStax and an Apache Cassandra committer. Previously, she worked as a Researcher at the Advanced Data Management Technologies Laboratory, University of Pittsburgh. Her work with Professor Panos K. Chrysanthis and Associate Dean Adam J. Lee on “Authorization-aware optimization for multi-provider queries” was published at the Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing. Before coming to the US, she also worked for 7 years...