6 alternatives to OpsGenie for managing monitoring alerts

Reduce the time from IT problems to fixes without breaking your budget.

Lewis Cowles, CC BY-SA 4.0

Note from the Editor: the following is the author's point of view related to the topic of managing monitoring systems.

As organizations move toward a new generation of distributed systems and microservice architecture, the DevOps world finds it increasingly difficult to keep up with the hybrid needs of today's application monitoring, and the alerts it generates. Managing this aspect of IT infrastructure has DevOps professionals turning to up-and-coming serverless methodologies for this purpose.

The software implementing this process ranges from commercial to open source, and expensive to free. Let's start by looking at the problem itself. What makes managing monitoring and alerts so difficult?

Managing monitoring

Managing monitoring and alerts becomes complicated when different organizations, working in different regions, each choose different communication mediums to make their employees and customers comfortable.

Let’s understand this issue a bit more through an example. Take a company which:

  • Has many products that live on various cloud and non-cloud platforms.
  • Uses chat and email services for internal communication.
  • Has support professionals working in different time zones.

Now, if an issue comes up with any of this company's products, the response team should act before the customer (and company) experiences negative effects. There won’t be much of a problem if the response team is already there to jump on the issue. If they are not, someone has to notify them in some way to limit the extent of functional or financial losses.

Here's the problem. People are not able to notice and respond to issues all the time. If you send the response team an email or text message, there is a good chance that no one on the team will see it before the issue causes significant financial loss. Also, the response team might already be receiving so many email alerts that, even if they are available, they may find it difficult to spot the high-impact issues among the smaller ones. In this situation, you should send someone from the response team a distinct alert, such as a phone call or a pager message. However, if you decide to call, you need to know who is actually available. Otherwise, you might have to call multiple people until you find a response team member who is ready to pick up a ringing phone at that very moment, and that can take even longer if your call comes at an odd time for their location.
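To make the "who is actually available" question concrete, here is a minimal Python sketch. It assumes a hypothetical roster of three responders with fixed UTC offsets (real time zones also shift with daylight saving time) and a 09:00 to 18:00 working window; none of these names or values come from a real tool.

```python
from datetime import datetime, time, timedelta, timezone

# Hypothetical roster: names and fixed UTC offsets are illustrative only.
ROSTER = [
    ("asha", timedelta(hours=5, minutes=30)),   # UTC+05:30
    ("ben", timedelta(hours=1)),                # UTC+01:00
    ("carla", timedelta(hours=-8)),             # UTC-08:00
]


def available_responders(now_utc, roster=ROSTER, start=time(9, 0), end=time(18, 0)):
    """Return the responders whose local clock is inside [start, end)."""
    available = []
    for name, offset in roster:
        # Convert the shared UTC timestamp to the responder's local time.
        local = now_utc.astimezone(timezone(offset))
        if start <= local.time() < end:
            available.append(name)
    return available


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(available_responders(now))
```

An alerting tool does this lookup for you, so the first call goes to someone who is likely to answer instead of someone who is asleep.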

Instead, what you need is a tool that not only monitors your systems but also intelligently manages the alert process for the quickest results possible. A popular commercial option is OpsGenie, and in this article, we will talk about open source alternatives to this proprietary option.

What we want from OpsGenie

OpsGenie is a paid alerting tool that helps organizations set up a smart alerting and notification process. In addition to on-call rotation management, OpsGenie currently supports notifications to and from almost all existing systems, paid and free. It is also a good fit for a DevOps environment that relies on large amounts of automation, integration with chatbots, and on-call rotation. The need for technical support during an outage is one of the more important reasons to consider OpsGenie.

We will focus on only the essential part of open source alerting tools in our comparison with OpsGenie. In many environments, that involves connecting teams by managing the following:

  • Alerts to teams who rely on the service.
  • A dashboard to view system status.
  • Integrations with chat tools and automated response.

Note from the Editor: at the time of publishing, OpsGenie does have a free offering within certain usage limits. Visit their site for the most up-to-date details related to their services.

Open source alerting tools

There are open source tools that can cover the parts of OpsGenie that I believe to be essential for managing monitoring systems.

Cabot

Cabot provides all of the necessary features to get a complete monitoring picture of your infrastructure. Cabot supports alerts through phone, email, SMS, HipChat, and Slack. It is written in Python and mostly uses the Django framework. Cabot does not depend on Java or other memory-hungry processes, which makes it a stable choice.

Nagios

Nagios Core is free and open source, but its support and some plugins have a cost. Thankfully, Nagios Core on its own is a great option for infrastructure monitoring and alerting. It supports notifications via email and has a few other options as integrations. It also supports user-defined notification mechanisms. If you have an API that can process alerts and send custom notifications to one or more mediums (such as Slack, HipChat, or SMS), this tool could be a good fit for you.
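As a rough idea of what a user-defined notification mechanism could look like, the Python sketch below turns the fields of a service alert into the JSON body that a Slack incoming webhook accepts (an object with a "text" key). It assumes you wire the script into a Nagios notification command yourself and pass in macros such as $NOTIFICATIONTYPE$, $HOSTNAME$, $SERVICEDESC$, $SERVICESTATE$, and $SERVICEOUTPUT$ as arguments; the function names and message layout are hypothetical.

```python
import json
import urllib.request


def build_slack_payload(notification_type, host, service, state, output):
    """Format Nagios-style alert fields into a Slack incoming-webhook body."""
    text = f"[{notification_type}] {host}/{service} is {state}: {output}"
    return json.dumps({"text": text})


def send_to_slack(webhook_url, payload):
    """POST the payload to the webhook URL that Slack issued for your channel."""
    request = urllib.request.Request(
        webhook_url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Passing data makes this a POST request; a timeout avoids hanging the notifier.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Example values only; in practice these would come from Nagios macros.
    print(build_slack_payload("PROBLEM", "web01", "HTTP", "CRITICAL", "connection timed out"))
```

Keeping the formatting separate from the sending makes it easy to reuse the same message for another medium, such as SMS.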

ngDesk

ngDesk can handle your on-call rotation, automatically escalate alerts when there is no response, and offers a ticketing tool as well. ngDesk is still working on the complete package, so stay tuned to this up-and-coming project.
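Automatic escalation is easier to picture with a small example. The Python sketch below is a generic illustration, not ngDesk's implementation: it assumes a hypothetical three-tier policy in which each tier is paged once an alert has stayed unacknowledged for a given number of minutes.

```python
# Hypothetical policy: (minutes unacknowledged before paging, who gets paged).
POLICY = [
    (0, "primary on-call"),
    (10, "secondary on-call"),
    (30, "team lead"),
]


def escalation_targets(minutes_since_alert, acknowledged, policy=POLICY):
    """Return every tier that should have been paged by now."""
    if acknowledged:
        # Someone has picked up the alert, so nobody else needs to be paged.
        return []
    return [who for delay, who in policy if minutes_since_alert >= delay]


if __name__ == "__main__":
    for minutes in (0, 12, 45):
        print(minutes, escalation_targets(minutes, acknowledged=False))
```

The point of such a policy is that an ignored alert never stops at one person's inbox; it keeps moving up until someone acknowledges it.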

Open Distro for Elasticsearch

Open Distro for Elasticsearch is a recent addition to the monitoring and alerting landscape. This project supports almost all chatbots, email, and a variety of other alert mechanisms. A complete, pluggable monitoring and alerting module, Open Distro for Elasticsearch is a combination of many tools. With it, you can view alerts in Kibana, so there’s no need to use a separate tool, and you can get notified the way you want with supported integrations and receivers. Authentication support has been added to Kibana, Elasticsearch, and the other tools grouped in this combo, free of cost, so you can specify who has view access, and to what, in your Elastic Stack.

OpenDuty

Another alerting tool providing big competition to the paid alternatives is OpenDuty. While still in beta, this project already supports SMS, phone calls, email, Slack, HipChat, and various other paid and open source integrations for sending alerts. Integrations with other alerting tools like Nagios are also supported, along with compatibility with the paid alerting tool PagerDuty, most likely to help people migrate.

Prometheus Alertmanager

Alertmanager lets you route alerts that match specific definitions to integrations that are easy to set up. These integrations can then broadcast alerts to endpoint devices, and admins can silence alerts if needed. Regardless of its limitations, Alertmanager is still a very good tool for sending push notifications to chat platforms and cell phones.
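The routing-and-silencing idea can be sketched in a few lines of Python. This is a simplified model, not Alertmanager's code or configuration format (the real tool uses a routing tree defined in YAML, with grouping and richer matchers); the route table, receiver names, and label values below are all hypothetical.

```python
def matches(labels, matchers):
    """True when every matcher key/value pair is present in the alert labels."""
    return all(labels.get(key) == value for key, value in matchers.items())


# Hypothetical routes, checked in order: (label matchers, receiver name).
ROUTES = [
    ({"severity": "critical"}, "pager"),
    ({"team": "web"}, "web-chat"),
]
DEFAULT_RECEIVER = "email"


def route(alert_labels, silences=(), routes=ROUTES, default=DEFAULT_RECEIVER):
    """Return the receiver for an alert, or None if a silence mutes it."""
    if any(matches(alert_labels, silence) for silence in silences):
        return None
    for matchers, receiver in routes:
        if matches(alert_labels, matchers):
            return receiver
    return default


if __name__ == "__main__":
    alert = {"alertname": "HighLatency", "severity": "critical", "team": "web"}
    print(route(alert))
    print(route(alert, silences=[{"team": "web"}]))
```

Because routing depends only on labels, adding a new integration is a matter of adding one more route rather than touching the alerts themselves.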

Wrapping up

If budget or using only open source software is a top concern, there are plenty of response team alerting options available. Start by taking a look at the weaknesses in your existing setup and pinpointing where your organization drops the ball on IT issues and lets them escalate into real problems. Doing so makes it easier to choose which tool, or combination of tools, you should implement to best address these gaps. It is okay to use more than one if it helps you get a complete picture of managing your monitoring infrastructure.

iamabhi
I work as Lead DevOps, a programmer. I am an open source enthusiast, blogger, writer. You will find me helping people learn, mostly.

9 Comments

You are right, alerta has potential, after reading through the docs. I was mostly looking for tools that can push alerts directly or have most suitable integrations. I missed looking at this one.

In reply to by Kjetilmjos

Mmmm not sure the research on this is super complete. Some of these "alternatives" haven't been touched in years...

It's worth noting, as a former monitoring nerd, that OpsGenie does a great deal of work beyond any one of these tools. The combination of these could cover the necessary elements if you agree with the basics that you only need a dashboard and alerts. Some additional considerations that are important for me:

- what happens when an alert is repeating aka "flapping?"
- what's the definition of an "incident?" Does it correlate multiple events?
- How does it detect whether something is added or removed from the environment intentionally?

This is a great start to the deep topic of monitoring systems! Thanks for writing it.

I would take it a step further and argue that almost none of the alternatives listed in this article are worth comparison. There is false equivalence here, and I don't think the original author did proper investigation before making comparisons. OpsGenie is an alert aggregator and router, and does absolutely no monitoring of its own - it is designed from the ground up to be fed alerts from a vast array of sources, either via email or API calls, and much is possible once those alerts are received. The closest equivalents would be the now-defunct AlertCentral or PagerDuty, and I assume OpenDuty follows in PagerDuty's footsteps. Nothing else on this list that I'm aware of falls into the same problem domain.

Thanks for your comment.

I see your comment in fact lines up well with what I wrote about OpsGenie. This whole article revolves around alerting. OpsGenie supports inputs from a lot of integrations and provides utilities like on-call rotation schedules, SMS, call response, ack, and other supportive actions from Slack, HipChat, and a variety of other tools.
For all the other tools mentioned, the idea was to provide a list of open source tools that can provide something like OpsGenie. Of course, not everything can be replicated from a product that took an organization a lot of time to build and that comes with paid support and a limited free quota.

Thanks for reminding me that I missed a few very important lines in that section.

In reply to by Sgee (not verified)

Thanks for sharing

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.