7 infrastructure performance and scaling tools you should be using

Sysadmins, site reliability engineers (SREs), and cloud operators all too often struggle to feel confident in their infrastructure as it scales up. Just as often, they assume the only way to solve their challenges is to write a tool for in-house use. Fortunately, there are other options: many open source tools are available to test an infrastructure's performance. Here are my favorites.

Pbench

Pbench is a performance testing harness to make executing benchmarks and performance tools easier and more convenient. In short, it:

Excels at running micro-benchmarks on large scales of hosts (bare-metal, virtual machines, containers, etc.) while automating a potentially large set of benchmark parameters
Focuses on installing, configuring, and executing benchmark code and performance tools and not on provisioning or orchestrating the testbed (e.g., OpenStack, RHEV, RHEL, or Docker)
Is designed to work in concert with provisioning tools like Browbeat or Ansible playbooks
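
To make "automating a potentially large set of benchmark parameters" concrete, here is a minimal Python sketch of how a parameter sweep expands into individual runs. It is not Pbench code and does not use Pbench's interface; the function name `expand_sweep` and the parameter names and values are hypothetical, chosen only for illustration.

```python
from itertools import product


def expand_sweep(params):
    """Return one dict per combination of the given parameter values."""
    names = list(params)
    value_lists = [params[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical parameter names and values, not taken from Pbench.
sweep = expand_sweep({
    "block_size_kib": [4, 64, 1024],
    "io_pattern": ["read", "write"],
    "threads": [1, 8],
})

# Each entry describes one benchmark run a harness would have to execute.
for run_id, config in enumerate(sweep, start=1):
    print(run_id, config)
```

Three small lists already produce 12 runs, which is why a harness that generates and executes the combinations for you pays off quickly.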

Pbench's documentation includes installation and user guides, and the code is maintained on GitHub, where the team welcomes contributions and issues.

Ripsaw

Baselining is a critical aspect of infrastructure reliability. Ripsaw is a performance benchmark Operator for launching workloads on Kubernetes. It deploys as a Kubernetes Operator, which in turn deploys common workloads, including specific applications (e.g., Couchbase) and general performance tests (e.g., Uperf), to measure and establish a performance baseline.
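
Once you have a baseline, the usual next step is to compare new results against it. The Python sketch below shows one simple way to do that, assuming you have already extracted a single numeric result per run. It is independent of Ripsaw; the function name `is_regression`, the 10% tolerance, and the sample numbers are all assumptions made for illustration.

```python
from statistics import median


def is_regression(baseline_runs, new_value, tolerance=0.10, higher_is_better=True):
    """Return True if new_value is worse than the baseline median
    by more than the given fractional tolerance."""
    baseline = median(baseline_runs)
    if higher_is_better:
        # Throughput-style metric: a drop below the tolerance band is a regression.
        return new_value < baseline * (1 - tolerance)
    # Latency-style metric: a rise above the tolerance band is a regression.
    return new_value > baseline * (1 + tolerance)


# Made-up throughput samples (Gbit/s) standing in for earlier baseline runs.
baseline_throughput = [9.4, 9.1, 9.3, 9.5, 9.2]

print(is_regression(baseline_throughput, 9.0))  # within tolerance of the median
print(is_regression(baseline_throughput, 8.0))  # more than 10% below the median
```

Using the median rather than the mean keeps one noisy baseline run from shifting the threshold.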

Ripsaw is maintained on GitHub. You can also find its maintainers on the Kubernetes Slack, where they are active contributors.

OpenShift Scale

The collection of tools in OpenShift Scale, OpenShift's open source solution for performance testing, does everything from spinning up OpenShift on OpenStack installations (TripleO Install and ShiftStack Install) and installing on Amazon Web Services (AWS) to providing containerized tooling, such as running Pbench on your cluster or doing cluster limits testing, network tests, storage tests, metric tests with Prometheus, logging, and concurrent build testing.

Scale's CI suite is flexible enough that you can add workloads to it and include your own workloads when deploying to Azure or anywhere else you might run. You can see the full suite of tools on GitHub.

Browbeat

Browbeat calls itself "a performance tuning and analysis tool for OpenStack." You can use it to analyze and tune the deployment of your workloads. It also automates the deployment of standard monitoring and data analysis tools like Grafana and Graphite. Browbeat is maintained on GitHub.

Smallfile

Smallfile is a filesystem workload generator targeted for scale-out, distributed storage. It has been used to test a number of open filesystem technologies, including GlusterFS, CephFS, Network File System (NFS), Server Message Block (SMB), and OpenStack Cinder volumes. It is maintained on GitHub.
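
To give a feel for what a small-file workload measures, here is a minimal, single-process Python sketch that creates many small files and times the operation. It is not Smallfile and does not use its interface; the function name `create_small_files`, the file count, and the file size are assumptions for illustration, and a real workload generator also coordinates multiple hosts and exercises other operations.

```python
import os
import tempfile
import time


def create_small_files(directory, count, size_bytes):
    """Create `count` files of `size_bytes` each and return the elapsed seconds."""
    payload = b"\0" * size_bytes
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"file_{i:06d}.dat")
        with open(path, "wb") as handle:
            handle.write(payload)
    return time.perf_counter() - start


# Run against a throwaway directory; point this at the filesystem under test instead.
with tempfile.TemporaryDirectory() as workdir:
    file_count, file_size = 1000, 4096
    elapsed = create_small_files(workdir, file_count, file_size)
    names = os.listdir(workdir)
    created = len(names)
    total_bytes = sum(os.path.getsize(os.path.join(workdir, n)) for n in names)
    print(f"created {created} files ({total_bytes} bytes) in {elapsed:.3f}s")
```

Note that this sketch never calls `fsync`, so on most systems it mostly measures metadata handling and the page cache rather than the storage device itself.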

Ceph Benchmarking Tool

Ceph Benchmarking Tool (CBT) is a testing harness that can automate tasks for testing Ceph cluster performance. It records system metrics with collectl, and it can collect more information with tools including perf, blktrace, and valgrind. CBT can also do advanced testing that includes automated object storage daemon outages, erasure-coded pools, and cache-tier configurations.

Contributors have extended CBT to use Pbench monitoring tools and Ansible and to run the Smallfile benchmark. A separate Grafana visualization dashboard uses Elasticsearch data generated by Automated Ceph Test.

satperf

Satellite-performance (satperf) is a set of Ansible playbooks and helper scripts to deploy Satellite 6 environments and measure the performance of selected actions, such as concurrent registrations, remote execution, Puppet operations, repository synchronizations and promotions, and more. You can find satperf on GitHub.

Conclusion

Sysadmins, SREs, and cloud operators face a wide variety of challenges as they work to scale their infrastructure, but luckily there is also a wide variety of tools to help them get past those common issues. Any of these seven tools should help you get started testing your infrastructure's performance as it scales.

Are there other open source performance and scaling tools that should be on this list? Add your favorites in the comments.
