Have a Prometheus setup but want to archive your metrics into OpenTSDB, InfluxDB or something else?

Read on to see how you can do this with sop.

Introducing sop

sop is an open source tool that we built to serve as a Swiss Army knife for your monitoring toolbox.

In an ideal world, you’d have a single monitoring solution, perfectly observable software, and centralized highly available time series storage. If you’re not there yet, sop can help you bridge the gap a little. Or a lot.

sop is open source (Apache 2.0 license), and lives on GitHub. We (RapidLoop) develop and maintain sop, which complements our products and services related to cloud-native monitoring solutions. sop is currently in beta.

sop has a few tricks up its sleeve. Here’s a 10,000-foot overview:

An overview of sop

Essentially, sop accepts time series data pushed into it from various “inputs”. This data is stored internally in an embedded RocksDB database, and can be queried using the Prometheus HTTP API. All incoming data can also be sent to various “outputs”.

  • The data from each input can be (optionally) filtered to select only a subset of the incoming data.
  • This data can then be (optionally) downsampled before storing.
  • At any time, you can have any number of inputs, APIs or outputs running.
  • The HTTP API supports Grafana – you can configure a Grafana “Prometheus” datasource pointing to sop.
  • The HTTP API also supports federation, with external label support.

As shown in the picture, the currently implemented inputs are Prometheus, InfluxDB and NATS streaming server; the outputs are OpenTSDB, InfluxDB, Prometheus and NATS streaming server. Prometheus is the only API implemented so far, but support for a subset of InfluxQL or the Graphite render API is a possibility for the future.

sop creates and manages three RocksDB databases to store all the metrics data. RocksDB provides fast startup times and LZ4 compression, resulting in disk usage of less than 1 byte per sample. That makes sop suitable by itself for long-term archiving, but more on that in another blog post.

Setting up sop

sop is written in Go, and comes as a zero-dependency single binary for 64-bit Linux. The easiest way to get it is from the GitHub releases page:

$ wget https://github.com/rapidloop/sop/releases/download/v1.0-beta.1/sop-1.0-beta.1.tar.gz
$ tar xvf sop-1.0-beta.1.tar.gz
$ cd sop-1.0-beta.1

sop relies on a single configuration file, written in TOML (which looks a lot like an ini file). It can print out a sample configuration file with the default settings using the “-p” flag:

$ ./sop -p > sop.cfg

And that’s about all it takes to set up sop! Let’s have a look at the configuration file.

Inputs, APIs and Outputs

Let’s edit sop.cfg so that we can send Prometheus v1.x data to OpenTSDB and InfluxDB. You should be able to see most of the snippets below in sop.cfg as comments.

Input

First, the input:

[[input]]
type = "prometheusv1_remote_write"
listen = "0.0.0.0:9097"
filter_exclude = [ '{env="staging-1"}', 'foo{bar="baz"}' ]

This tells sop to listen on port 9097 and accept incoming Prometheus v1.x remote write requests over HTTP. You can use the URL http://your.sop.server:9097/ as a remote write URL in Prometheus (see the Prometheus documentation on remote write for details).

The filter_exclude setting is a list of PromQL selectors. All metrics that match any of these expressions will be dropped silently.

Alternatively, you can use filter_include:

filter_include = [ '{env="production"}' ]

The include filter says that only these metrics should be stored. The rest will be dropped silently.

An input can have either an include filter or an exclude filter, but not both. It can also have no filter at all. Filters are configured separately for each input.
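To make the filter semantics concrete, here is a minimal Python sketch of how an include or exclude list decides whether a sample is kept. It is an illustration only, not sop’s implementation: the function names are made up for this example, selectors are represented as already-parsed (metric name, labels) pairs, and only equality matchers are handled.

```python
def matches(selector, name, labels):
    """Return True if a series matches a selector.

    selector is a (metric_name_or_None, {label: value}) pair, a simplified,
    pre-parsed stand-in for a PromQL selector (equality matchers only).
    """
    sel_name, sel_labels = selector
    if sel_name is not None and sel_name != name:
        return False
    return all(labels.get(k) == v for k, v in sel_labels.items())


def keep(name, labels, include=None, exclude=None):
    """Decide whether a sample passes the filter (hypothetical helper)."""
    if include is not None and exclude is not None:
        raise ValueError("use either include or exclude, not both")
    if include is not None:
        # Include filter: keep only series matching at least one selector.
        return any(matches(s, name, labels) for s in include)
    if exclude is not None:
        # Exclude filter: drop series matching any selector.
        return not any(matches(s, name, labels) for s in exclude)
    return True  # no filter configured


# The two selectors from the filter_exclude example above, pre-parsed.
EXCLUDE = [(None, {"env": "staging-1"}), ("foo", {"bar": "baz"})]

if __name__ == "__main__":
    print(keep("http_requests_total", {"env": "staging-1"}, exclude=EXCLUDE))
    print(keep("foo", {"bar": "qux"}, exclude=EXCLUDE))
```

The first call reports that the staging series is dropped; the second keeps foo because its bar label does not match.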

Output

Here are the two output sections, mostly self-explanatory:

[[output]]
type         = "opentsdb"
url          = "http://my.opentsdb.server:4242/"
timeout_secs = 3

[[output]]
type           = "influxdb"
address        = "my.influxdb.server:8086"
protocol       = "http"
database       = "prometheus"
filter_exclude = [ 'up' ]

You can have filters (filter_include and filter_exclude) for any output, just as for inputs. In this example, the influxdb output discards metrics named “up”.

InfluxDB metrics can be sent over http, https or udp. There are also options for credentials and retention policy (see the sop.cfg file for the full list).

Note that OpenTSDB supports at most 8 tags per data point by default, so metrics may be dropped if they have more than 8 labels.
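For a feel of what the tag limit means in practice, here is a rough Python sketch that turns a Prometheus-style sample into the JSON shape accepted by OpenTSDB’s /api/put endpoint, and refuses samples with too many labels. How sop actually maps labels to tags is not covered here, so treat the one-to-one mapping, the helper name and the MAX_TAGS constant as assumptions; the sketch also skips any sanitizing of metric or tag names.

```python
import json

MAX_TAGS = 8  # the limit quoted above


def to_opentsdb_point(name, labels, value, timestamp_s):
    """Build an /api/put data point, or return None if there are too many tags.

    Hypothetical helper: assumes each Prometheus label becomes one tag.
    """
    if len(labels) > MAX_TAGS:
        return None  # would be dropped
    return {
        "metric": name,
        "timestamp": int(timestamp_s),  # seconds since the epoch
        "value": value,
        "tags": dict(labels),
    }


if __name__ == "__main__":
    point = to_opentsdb_point(
        "node_load1", {"job": "node", "env": "production"}, 0.42, 1507614334
    )
    print(json.dumps(point))
```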

API

Let’s also configure an API, so that we can connect Grafana to sop:

[[api]]
type = "prometheus_http"
listen = "0.0.0.0:9095"
federation_labels = {generator="sop",foo="bar"}

This makes sop start an HTTP API at port 9095. You can use http://your.sop.server:9095 as a Prometheus data source in Grafana. (Note: Grafana does not like trailing / in URLs!)
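Grafana is not the only possible client. Assuming sop answers the standard Prometheus instant query endpoint (/api/v1/query), which we have not verified for every query type, a small Python script could query it directly. The host name is the placeholder used above, and the function names are ours.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the [[api]] section above is active on this address.
SOP_API = "http://your.sop.server:9095"


def instant_query_url(base, expr):
    """Build a Prometheus-style instant query URL for a PromQL expression."""
    return base.rstrip("/") + "/api/v1/query?" + urlencode({"query": expr})


def run_query(base, expr, timeout=5):
    """Fetch and decode the JSON response (needs a reachable server)."""
    with urlopen(instant_query_url(base, expr), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL, so it is safe to run without a server.
    print(instant_query_url(SOP_API, 'up{env="production"}'))
```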

Federation is available at the /federate endpoint. As shown, you can add any set of labels to all metrics that appear in the federation output. Only the plain text format is currently supported.
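In Prometheus, a federation request selects series with one or more match[] URL parameters. Assuming sop’s /federate endpoint follows the same convention, the request URL can be built like this (federate_url is a name chosen for this sketch):

```python
from urllib.parse import urlencode


def federate_url(base, selectors):
    """Build a /federate URL with one match[] parameter per selector."""
    params = [("match[]", s) for s in selectors]
    return base.rstrip("/") + "/federate?" + urlencode(params)


if __name__ == "__main__":
    print(federate_url("http://your.sop.server:9095", ['{env="production"}', "up"]))
```

The response is the plain text exposition format, with the configured federation_labels attached to every series.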

Downsampling

Configuring downsampling is rather simple:

[general]
downsample_seconds = 120

This makes sop write metrics data to the main time series database every 120 seconds. On each write, it stores the latest value of each metric. Like Prometheus, sop can discard “stale” metrics: values that haven’t been updated within a given period, say 4 minutes, are no longer “perpetuated”:

[general]
ttl_seconds = 240

4 minutes of “staleness” is the default.
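The interplay between the two settings is easier to see in code. The following Python sketch models the behaviour described above: remember the latest value per series, write everything out once per downsample interval, and stop carrying forward series that have gone stale. It is a toy model under our own assumptions (class and method names are invented, time is passed in explicitly), not a description of sop’s internals.

```python
class Downsampler:
    """Toy model of latest-value downsampling with a staleness TTL."""

    def __init__(self, ttl_seconds=240):
        self.ttl = ttl_seconds
        self.latest = {}  # series name -> (timestamp, value)

    def observe(self, series, timestamp, value):
        """Record an incoming sample, replacing any earlier value."""
        self.latest[series] = (timestamp, value)

    def flush(self, now):
        """Called once per downsample interval; returns the values to store."""
        # Forget series whose last update is older than the TTL.
        self.latest = {
            s: tv for s, tv in self.latest.items() if now - tv[0] <= self.ttl
        }
        return {s: value for s, (_, value) in self.latest.items()}


if __name__ == "__main__":
    d = Downsampler(ttl_seconds=240)
    d.observe("a", 0, 1.0)
    d.observe("a", 60, 2.0)   # overwrites the earlier value of "a"
    d.observe("b", 10, 5.0)
    print(d.flush(now=120))   # both series written, "a" with its latest value
    print(d.flush(now=360))   # "b" was last updated 350s ago and is dropped
```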

Running sop

Finally, you can run sop by passing it the path to the sop.cfg configuration file:

$ ./sop sop.cfg
2017/10/10 05:45:34.919778 main.go:76: sop starting: version=1.0-beta1, pid=28936
2017/10/10 05:45:34.919988 main.go:92: started output: influxdb (addr=http://my.influxdb.server:8086, db=prometheus, rp=)
2017/10/10 05:45:34.920052 main.go:92: started output: opentsdb (url=http://my.opentsdb.server:4242/api/put?details)
2017/10/10 05:45:34.983347 main.go:114: started database: path=data (took 63.257879ms)
2017/10/10 05:45:34.983395 main.go:127: started storer: ttl=4m0s, downsample=120s
2017/10/10 05:45:34.983442 main.go:140: started reaper: retain=4320h0m0s, gc=24h0m0s
2017/10/10 05:45:34.983595 main.go:161: started input: prometheus v1 remote write (listen=0.0.0.0:9097)
2017/10/10 05:45:34.983959 main.go:187: started api: prometheus_http (listen=0.0.0.0:9095)
2017/10/10 05:45:34.983987 main.go:201: sop open for business

You should now be able to send Prometheus data to sop and have it filter, downsample, store and then forward it to OpenTSDB and InfluxDB.

The data is stored under a directory called “data”. Change the path in sop.cfg if you need to.
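If you are wondering how big that directory might get, a back-of-the-envelope estimate is easy to script. The sketch below takes the retention (4320h) and downsample interval (120s) from the startup log above, and uses the “less than 1 byte per sample” figure quoted earlier as an upper bound. The series count is a made-up number for illustration, and real usage depends on your data.

```python
def estimate_bytes(series, downsample_seconds, retain_hours, bytes_per_sample=1.0):
    """Rough upper bound on disk usage for the stored samples."""
    samples_per_series = (retain_hours * 3600) // downsample_seconds
    return series * samples_per_series * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical workload: 100,000 active series.
    total = estimate_bytes(series=100_000, downsample_seconds=120, retain_hours=4320)
    print(f"{total / 1e9:.2f} GB")
```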

To stop sop, simply hit ^C.

Check It Out Today!

sop can be used to build “metrics pipelines” to store, push, pull, downsample, archive, federate, duplicate and consolidate your time series data. There’s a lot about sop that we didn’t cover in this article, so stay tuned for more blog posts!

The internals of sop are fairly easy to understand and hack on – so if you fancy adding a new type of input or output, go ahead and check out the repo on GitHub. We’re happy to accept contributions!

Check out sop today and tell us what you think! Reach us on Twitter @therapidloop.