Getting started with Gnocchi

Gnocchi, which enables storage and indexing of time series data and resources at large scale, is purpose-built for today's huge cloud platforms.

Image by: Scott Meyers. Modified by Opensource.com. CC BY-SA 2.0.

Gnocchi is an open source time series database created in 2014, when OpenStack was looking for a highly scalable, fault-tolerant time series database that did not depend on a specialized database such as Hadoop or Cassandra.

Gnocchi was originally built inside OpenStack but later moved out of the project because it was designed to be platform-agnostic. Even so, OpenStack still uses Gnocchi as its main time series backend; for example, OpenStack Ceilometer relies on Gnocchi's scalability and high availability to keep its telemetry available and fast.

The problem that Gnocchi solves is storing and indexing time series data and resources at large scale. Modern cloud platforms are not only huge; they are also dynamic and potentially multi-tenant. Gnocchi takes all of that into account.

Aggregation

Gnocchi takes a unique approach to time series storage: Rather than storing raw data points, it aggregates them before storing them. This built-in feature is different from most other time series databases, which usually support this mechanism as an option and compute aggregation (average, minimum, etc.) at query time.

Because Gnocchi computes all the aggregations at ingestion, getting the data back is extremely fast, as it just needs to read back the pre-computed results.

The way those data points are aggregated is configurable on a per-metric basis, using an archive policy. An archive policy defines which aggregations to compute and how many aggregates to keep. Gnocchi supports a wide range of aggregation methods, such as minimum, maximum, average, Nth percentile, and standard deviation. Those aggregations are computed over a period of time (called granularity) and are kept for a defined timespan. Aggregates are stored in a compressed format, so the data points take as little space as possible.

For example, imagine you define an archive policy to keep the average, minimum, and maximum of a time series with five-minute granularity for 30 days. In that case, Gnocchi will compute the average, minimum, and maximum of the values over five-minute ranges, keeping up to 8,640 points (the number of five-minute aggregates you can have over 30 days).
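The 8,640 figure is simply the timespan divided by the granularity. The arithmetic can be checked with a few lines of Python; the function name points_to_keep is made up for this illustration and is not part of Gnocchi:

```python
from datetime import timedelta


def points_to_keep(timespan: timedelta, granularity: timedelta) -> int:
    """Number of aggregates that fit in `timespan` at one per `granularity`."""
    # Dividing two timedeltas with // yields a plain integer.
    return timespan // granularity


# 30 days of five-minute aggregates, as in the example above.
print(points_to_keep(timedelta(days=30), timedelta(minutes=5)))  # prints 8640
```

Note that this is the count per aggregation method: keeping the average, minimum, and maximum means up to 8,640 points for each of the three.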

Getting started

Before installing Gnocchi, you need to decide which measures and aggregate storage drivers you want to use. Gnocchi can leverage highly scalable storage systems, such as Ceph, OpenStack Swift, Redis, or even Amazon S3. If none of those options are available to you, you can still use a standard file system.

You also need a database to index the resources and metrics that Gnocchi will handle—both PostgreSQL and MySQL are supported.

The installation page describes how to set up Gnocchi and write its configuration file. Once this configuration file is written, the gnocchi-upgrade program will set up the storage engine and the database index so they are ready to be used.

Gnocchi is composed of two central services: an HTTP server, providing a REST API, and a metric processing daemon. The former is called gnocchi-api; the latter is named gnocchi-metricd. You need to run both services to use Gnocchi. The REST API will be used by clients to query Gnocchi and write data to it. The metricd service will ingest measures received by the API, compute the aggregates, and store them in the long-term aggregate storage.

Both services are stateless and therefore horizontally scalable. Unlike many time series databases, Gnocchi places no limit on the number of metricd daemons or API endpoints that you can run. If your load starts to increase, you just need to spawn more daemons to handle the flow of new requests. The same applies if you want to handle high-availability scenarios: just start more Gnocchi daemons on independent servers.

Using the command-line tool

Once Gnocchi is deployed, you can install the Gnocchi client command-line tool using pip (the Python package installer); just type pip install gnocchiclient. The client sends requests to the HTTP REST API, then formats and prints the replies.
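To give an idea of what the client does under the hood, here is a rough sketch that builds, without sending, the kind of HTTP request used to push measures to a metric. The host and port (localhost:8041), the placeholder metric ID, and the helper name build_add_measures_request are assumptions made for this illustration; authentication headers are left out, and the endpoint path follows Gnocchi's v1 REST API as I understand it, so check it against the API reference for your version:

```python
import json
from urllib.request import Request

# Assumption: a local Gnocchi API endpoint; adjust for a real deployment.
GNOCCHI_URL = "http://localhost:8041"
# Placeholder metric ID, for illustration only.
METRIC_ID = "00000000-0000-0000-0000-000000000000"


def build_add_measures_request(metric_id, measures):
    """Build (but do not send) a POST request that pushes measures to a metric."""
    body = json.dumps(
        [{"timestamp": ts, "value": value} for ts, value in measures]
    ).encode("utf-8")
    return Request(
        f"{GNOCCHI_URL}/v1/metric/{metric_id}/measures",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_add_measures_request(METRIC_ID, [("2017-10-20T12:00:00", 42)])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

In practice the command-line tool handles the URL, authentication, and output formatting for you, which is why the rest of this article uses it.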

To organize metrics, Gnocchi provides the notion of resources. A resource can have any number of metrics attached to it:

$ gnocchi resource create server42
+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| created_by_project_id |                                      |
| created_by_user_id    | admin                                |
| creator               | admin                                |
| ended_at              | None                                 |
| id                    | 43164383-6dc3-5034-a675-9a39493ca7df |
| metrics               |                                      |
| original_resource_id  | server42                             |
| project_id            | None                                 |
| revision_end          | None                                 |
| revision_start        | 2017-10-20T13:19:46.902754+00:00     |
| started_at            | 2017-10-20T13:19:46.902642+00:00     |
| type                  | generic                              |
| user_id               | None                                 |
+-----------------------+--------------------------------------+

Gnocchi handles revisions of resources; if any attribute is changed, Gnocchi will record that change and create a new entry in the history of the resource.

Once this resource is created, we can create a new metric for this resource:

$ gnocchi metric create -r server42 cpu
+------------------------------------+------------------------------------------------------------------+
| Field                              | Value                                                            |
+------------------------------------+------------------------------------------------------------------+
| archive_policy/aggregation_methods | std, count, min, max, sum, mean                                  |
| archive_policy/back_window         | 0                                                                |
| archive_policy/definition          | - points: 8640, granularity: 0:05:00, timespan: 30 days, 0:00:00 |
| archive_policy/name                | low                                                              |
| created_by_project_id              |                                                                  |
| created_by_user_id                 | admin                                                            |
| creator                            | admin                                                            |
| id                                 | ab0b5dab-31e9-4760-a58f-0dc324047f9f                             |
| name                               | cpu                                                              |
| resource/created_by_project_id     |                                                                  |
| resource/created_by_user_id        | admin                                                            |
| resource/creator                   | admin                                                            |
| resource/ended_at                  | None                                                             |
| resource/id                        | 43164383-6dc3-5034-a675-9a39493ca7df                             |
| resource/original_resource_id      | server42                                                         |
| resource/project_id                | None                                                             |
| resource/revision_end              | None                                                             |
| resource/revision_start            | 2017-10-20T13:19:46.902754+00:00                                 |
| resource/started_at                | 2017-10-20T13:19:46.902642+00:00                                 |
| resource/type                      | generic                                                          |
| resource/user_id                   | None                                                             |
| unit                               | None                                                             |
+------------------------------------+------------------------------------------------------------------+

This creates a metric named cpu and attaches it to the resource server42. New metrics use the low archive policy by default, which is created when Gnocchi is installed. The low archive policy stores several aggregation methods (std, count, min, max, sum, and mean) at five-minute granularity for 30 days.

This new metric has no measures, so it is time to send a few:

$ gnocchi measures add -m "2017-10-20 12:00@42" -m "2017-10-20 12:03@18" -m "2017-10-20 12:06@56" -r server42 cpu
$ gnocchi measures show -r server42 cpu
+---------------------------+-------------+-------+
| timestamp                 | granularity | value |
+---------------------------+-------------+-------+
| 2017-10-20T12:00:00+02:00 |       300.0 |  30.0 |
| 2017-10-20T12:05:00+02:00 |       300.0 |  56.0 |
+---------------------------+-------------+-------+

The three measures sent are aggregated into two data points for 12:00 and 12:05. By default, the aggregation method shown is mean, but you can request any other aggregation method:

$ gnocchi measures show --aggregation count -r server42 cpu
+---------------------------+-------------+-------+
| timestamp                 | granularity | value |
+---------------------------+-------------+-------+
| 2017-10-20T12:00:00+02:00 |       300.0 |   2.0 |
| 2017-10-20T12:05:00+02:00 |       300.0 |   1.0 |
+---------------------------+-------------+-------+

The count aggregation method computes the number of data points received over each interval.
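The results above can be reproduced by hand. The following is a minimal sketch of the idea, not Gnocchi's actual implementation: it groups the three measures into five-minute windows and computes the mean and count for each. The function name aggregate is made up for this example, and the timestamps are assumed to be in UTC for simplicity:

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean

GRANULARITY = 300  # seconds, i.e. five minutes


def aggregate(measures, granularity=GRANULARITY):
    """Group (timestamp, value) pairs into fixed windows and aggregate each one."""
    buckets = defaultdict(list)
    for ts, value in measures:
        epoch = int(ts.timestamp())
        # Round the timestamp down to the start of its window.
        buckets[epoch - epoch % granularity].append(value)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): {
            "mean": mean(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# The three measures sent earlier (timezone assumed to be UTC here).
measures = [
    (datetime(2017, 10, 20, 12, 0, tzinfo=timezone.utc), 42),
    (datetime(2017, 10, 20, 12, 3, tzinfo=timezone.utc), 18),
    (datetime(2017, 10, 20, 12, 6, tzinfo=timezone.utc), 56),
]

for start, aggregates in aggregate(measures).items():
    print(start.isoformat(), aggregates)
```

The 12:00 and 12:03 measures fall into the same window, giving a mean of 30 and a count of 2; the 12:06 measure is alone in the 12:05 window, giving a mean of 56 and a count of 1.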

The default resource type is generic, but you can create your own resource type with any number of extra attributes. Those attributes can be optional and can have specific types to make sure your data is properly structured and easy to consume:

$ gnocchi resource-type create -a ipaddress:string:true network-switch
+----------------------+----------------------------------------------------------+
| Field                | Value                                                    |
+----------------------+----------------------------------------------------------+
| attributes/ipaddress | max_length=255, min_length=0, required=True, type=string |
| name                 | network-switch                                           |
| state                | active                                                   |
+----------------------+----------------------------------------------------------+
$ gnocchi resource create --type network-switch --attribute ipaddress:192.168.2.3 sw18-prs.example.com
+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| created_by_project_id |                                      |
| created_by_user_id    | admin                                |
| creator               | admin                                |
| ended_at              | None                                 |
| id                    | 4fbdf14d-5650-58f5-b232-dc3cf91c28d0 |
| ipaddress             | 192.168.2.3                          |
| metrics               |                                      |
| original_resource_id  | sw18-prs.example.com                 |
| project_id            | None                                 |
| revision_end          | None                                 |
| revision_start        | 2017-10-20T15:08:30.935529+00:00     |
| started_at            | 2017-10-20T15:08:30.935513+00:00     |
| type                  | network-switch                       |
| user_id               | None                                 |
+-----------------------+--------------------------------------+
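To illustrate what typed attributes buy you, here is a small sketch of the kind of check a typed resource type makes possible. It is not Gnocchi's validation code: the NETWORK_SWITCH dictionary and the validate function are hypothetical, and simply mirror the constraints shown for the ipaddress attribute above (string, required, length between 0 and 255):

```python
# Hypothetical schema mirroring the network-switch attribute definition.
NETWORK_SWITCH = {
    "ipaddress": {"type": str, "required": True, "min_length": 0, "max_length": 255},
}


def validate(attributes, schema):
    """Return a list of problems; an empty list means the attributes are valid."""
    problems = []
    for name, rule in schema.items():
        if name not in attributes:
            if rule["required"]:
                problems.append(f"missing required attribute: {name}")
            continue
        value = attributes[name]
        if not isinstance(value, rule["type"]):
            problems.append(f"wrong type for attribute: {name}")
        elif not rule["min_length"] <= len(value) <= rule["max_length"]:
            problems.append(f"invalid length for attribute: {name}")
    # Attributes that the resource type does not declare are rejected.
    for name in sorted(attributes.keys() - schema.keys()):
        problems.append(f"unknown attribute: {name}")
    return problems


print(validate({"ipaddress": "192.168.2.3"}, NETWORK_SWITCH))
print(validate({}, NETWORK_SWITCH))
```

The first call returns an empty list, while the second reports the missing required attribute.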

If the attribute value ever changes, its history can be retrieved:

$ gnocchi resource update --type network-switch --attribute ipaddress:192.168.2.48 sw18-prs.example.com
+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| created_by_project_id |                                      |
| created_by_user_id    | admin                                |
| creator               | admin                                |
| ended_at              | None                                 |
| id                    | 4fbdf14d-5650-58f5-b232-dc3cf91c28d0 |
| ipaddress             | 192.168.2.48                         |
| metrics               |                                      |
| original_resource_id  | sw18-prs.example.com                 |
| project_id            | None                                 |
| revision_end          | None                                 |
| revision_start        | 2017-10-20T15:10:11.903018+00:00     |
| started_at            | 2017-10-20T15:08:30.935513+00:00     |
| type                  | network-switch                       |
| user_id               | None                                 |
+-----------------------+--------------------------------------+

$ gnocchi resource history --format yaml --type network-switch sw18-prs.example.com
- creator: admin
  ended_at: null
  id: 4fbdf14d-5650-58f5-b232-dc3cf91c28d0
  ipaddress: 192.168.2.3
  metrics: {}
  original_resource_id: sw18-prs.example.com
  project_id: null
  revision_end: '2017-10-20T15:10:11.903018+00:00'
  revision_start: '2017-10-20T15:08:30.935529+00:00'
  started_at: '2017-10-20T15:08:30.935513+00:00'
  type: network-switch
  user_id: null
- creator: admin
  ended_at: null
  id: 4fbdf14d-5650-58f5-b232-dc3cf91c28d0
  ipaddress: 192.168.2.48
  metrics: {}
  original_resource_id: sw18-prs.example.com
  project_id: null
  revision_end: null
  revision_start: '2017-10-20T15:10:11.903018+00:00'
  started_at: '2017-10-20T15:08:30.935513+00:00'
  type: network-switch
  user_id: null

The revision_start and revision_end fields indicate when each change was made. Both the old and the new values of the ipaddress attribute are accessible, making it easy to trace how a resource has been modified.
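The bookkeeping behind such a history can be sketched in a few lines. This is an illustration of the pattern visible in the output above (the old revision's revision_end equals the new revision's revision_start), not Gnocchi's own code; the update_resource helper is hypothetical and the timestamps are truncated to whole seconds:

```python
from datetime import datetime, timezone


def update_resource(history, now, **changes):
    """Close the current revision and append a new one carrying the changes."""
    current = history[-1]
    if all(current.get(key) == value for key, value in changes.items()):
        return history  # nothing changed, so no new revision is recorded
    current["revision_end"] = now
    history.append(
        {**current, **changes, "revision_start": now, "revision_end": None}
    )
    return history


created = datetime(2017, 10, 20, 15, 8, 30, tzinfo=timezone.utc)
updated = datetime(2017, 10, 20, 15, 10, 11, tzinfo=timezone.utc)

history = [
    {"ipaddress": "192.168.2.3", "revision_start": created, "revision_end": None}
]
update_resource(history, updated, ipaddress="192.168.2.48")

for revision in history:
    print(revision["ipaddress"], revision["revision_start"], revision["revision_end"])
```

The latest revision is always the one whose revision_end is empty, which matches the null value in the YAML output.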

Gnocchi offers other features, like search in resources and metrics, advanced cross-aggregation of metrics, access control list (ACL) management, and more.

Integration with other tools

Gnocchi can integrate with many other tools. For example, it is possible to use its collectd plugin to collect metrics with collectd and send them to Gnocchi.

Gnocchi also supports Grafana, a widely used tool for displaying charts from time series databases, through a Grafana plugin.

Gnocchi's roadmap

Gnocchi is actively developed, and new features are always coming. The next version (4.1) should provide more computing possibilities for cross-metric aggregation. Also, an ingestion point for Prometheus is being built so Gnocchi can be used as a long-term and scalable storage option for the Prometheus time series alerting and monitoring system.

About the author

Julien Danjou - Principal Software Engineer at Red Hat, working on OpenStack Telemetry. Free Software hacker for 15+ years, Emacs and Debian developer. Author of The Hacker's Guide to Python.