LinuxCzar

Engineering Software, Linux, and Observability. The website of Jack Neely.    

Notes on TSDBs

When I was young and learning to program, it was a milestone to write a Pizza Menu and Ordering program. Select your pizza, crust, and toppings, and see how much that pizza is going to cost. Text mode or graphical, and in a number of languages, I created this program over and over.

Today, it seems that the milestone to achieve is to write a Time Series Database (TSDB). You can’t be cool without one under your belt. There are, seemingly, quite a lot of them. Having spent several years working in Observability, focusing on telemetry-based monitoring and alerting, I’ve come across a lot of them and always wished I had some notes. So here we are: TSDBs that I know of and some brief notes. This is mostly for my own mental catalog, but, well, comment if you care!

Circonus / IRONdb

Commercial. SaaS-based TSDB as well as a drop-in replacement for Graphite. Histogram support using a log-linear approach similar to HdrHistogram but slightly simplified. Guarantees on mathematical accuracy. This is one of the few commercial TSDB companies I recommend to folks. I’ve studied their setup quite extensively but have never been a real customer. PhDs on staff make sure the math and analysis are correct. There’s also some illumos and OpenSolaris heritage here, and the solution is based on ZFS. I really like the technology stack and the dedication to correctness/accuracy.
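To make "log-linear" a bit more concrete, here is a minimal sketch of the general idea. This is not Circonus code: I am assuming buckets that keep two significant decimal digits per power of ten, and the names `log_linear_bucket` and `build_histogram` are made up for illustration.

```python
from collections import Counter
from decimal import Decimal


def log_linear_bucket(value):
    """Return (mantissa, exponent) for the bucket holding value.

    The bucket's lower bound is mantissa * 10**exponent. Assumption: two
    significant decimal digits per bucket, so mantissa is in 10..99
    (or -99..-10 for negative values). Zero gets its own bucket.
    """
    if value == 0:
        return (0, 0)
    d = Decimal(str(value))
    msd = d.adjusted()                 # power of ten of the leading digit
    mantissa = int(d.scaleb(1 - msd))  # keep two digits, truncate the rest
    return (mantissa, msd - 1)


def build_histogram(samples):
    """Count how many samples land in each log-linear bucket."""
    return Counter(log_linear_bucket(s) for s in samples)
```

Buckets are linear within a power of ten and logarithmic across them, so 1234 and 1250 both land in the bucket starting at 12 * 10**2, while 0.0057 lands in 57 * 10**-4. The relative error stays bounded no matter how large or small the values get.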

DalmatinerDB

  • URL: https://dalmatiner.io/
  • Data Model: Many time series formats
  • Language: Erlang
  • Database: Custom
  • I’ve Used: No

illumos / OpenSolaris / SmartOS heritage shows through. More or less requires ZFS. Built on Riak Core for clustering and data distribution.

Graphite

  • URL: https://graphiteapp.org/
  • Data Model: Period separated namespace searched with good old fnmatch()
  • Language: Python
  • Database: Whisper files
  • I’ve Used: Yes

One Whisper file per metric, with the namespace easily mapped to a directory and file path on disk. Python code isn’t fast, but most of the stack has been re-implemented in C and Go. Pair up with carbon-c-relay and go-carbon for scaling. Clustering exists but is not self-healing. Metric labels are possible using a MySQL DB, but traditional setups just use the dot-separated namespacing. Use ZFS with lz4 compression and enable sparse files. Other database backends are possible and exist. This is the project that really started the revolution.
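The namespace-to-path mapping and the glob matching are simple enough to sketch. This is an illustration rather than Graphite's own code: the storage root is an assumed default, the helper names are hypothetical, and only `fnmatch`-style wildcards are handled (no `{a,b}` alternation).

```python
import fnmatch
import posixpath

# Assumed storage root; adjust for your install.
WHISPER_ROOT = "/opt/graphite/storage/whisper"


def metric_to_path(metric, root=WHISPER_ROOT):
    """Map a dot-separated metric name to its Whisper file path."""
    return posixpath.join(root, *metric.split(".")) + ".wsp"


def matches(pattern, metric):
    """Match a glob pattern against a metric name, one path segment at a time.

    A wildcard never crosses a dot, so the pattern and the metric must have
    the same number of segments.
    """
    pattern_parts = pattern.split(".")
    metric_parts = metric.split(".")
    if len(pattern_parts) != len(metric_parts):
        return False
    return all(
        fnmatch.fnmatchcase(m, p) for p, m in zip(pattern_parts, metric_parts)
    )
```

So `servers.web01.cpu.user` becomes `servers/web01/cpu/user.wsp` under the storage root, and `servers.*.cpu.user` matches it while `servers.*` does not.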

InfluxDB

  • URL: https://www.influxdata.com/
  • Data Model: Measurement with key/value tags and fields
  • Language: Go
  • Database: Custom
  • I’ve Used: No

Commercial. Used to be fully Open Source, then clustering required a commercial license. Early versions of the database were fairly unreliable, but those issues have since been resolved.
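The measurement/tags/fields model is easiest to see in the line protocol that InfluxDB ingests. Below is a minimal sketch of a formatter, as I understand the format. The function names are hypothetical, and escaping of spaces, commas, and quotes is deliberately not handled.

```python
def format_field(value):
    """Render one field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # check bool first, since bool is also an int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    return f'"{value}"'


def to_line(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value,... field=value,... timestamp."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(
        f"{k}={format_field(v)}" for k, v in sorted(fields.items())
    )
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"
```

Tags are indexed strings that identify the series; fields carry the actual values. A call such as `to_line("cpu", {"host": "web01"}, {"usage": 0.5}, 1465839830100400200)` yields `cpu,host=web01 usage=0.5 1465839830100400200`.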

OpenTSDB

  • URL: http://opentsdb.net/
  • Data Model: Key/value
  • Language: Java/JVM
  • Database: HBase
  • I’ve Used: No

OpenTSDB is the Open Source standard. However, it requires an HBase stack to scale. Histogram support is included as well.

Prometheus

  • URL: https://prometheus.io
  • Data Model: label/value sets identify unique time series
  • Language: Go
  • Database: Custom
  • I’ve Used: Yes

The cool kid on the block. The data model is heavily inspired by OpenTSDB. Designed for very limited retention without help and aimed at operational intelligence only. Not always considered a “traditional” TSDB, but very powerful and easy to deploy. Super efficient local custom database. No clustering. Scale by sharding with separate Prometheus instances. Very large metric footprints or deployments may need something with fewer scaling limits.
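Two of the ideas above fit in a few lines: a series is identified by its full set of label/value pairs, and scaling means splitting scrape targets across independent instances. This is a rough sketch, not Prometheus code. `series_key` and `shard_for` are hypothetical names, and the SHA-256 based hash is my own choice for illustration.

```python
import hashlib


def series_key(labels):
    """Identity of a time series: its label/value pairs, order-independent."""
    return tuple(sorted(labels.items()))


def shard_for(target, shard_count):
    """Assign a scrape target to one of shard_count instances.

    Uses a stable hash so every instance computes the same assignment.
    """
    digest = hashlib.sha256(target.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Changing any label value produces a different `series_key`, which is why high-cardinality labels multiply the number of series. Each Prometheus instance would scrape only the targets where `shard_for(target, n)` equals its own shard number.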

PromHouse

Not production ready. Long term storage with clustering and downsampling for Prometheus.

RRDTool

The original Round Robin Database. This is just a set of C-based CLI tools and libraries (Lua, shell, Perl, and Python bindings exist). RRD files have fixed resolution and retention, with automated downsampling. Append only. Graphs are created by the rrdtool CLI with RPN-style math.
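For anyone who has not met RPN math before, here is a tiny evaluator in the spirit of rrdtool's comma-separated expressions such as `bytes,8,*`. It is only a sketch: `eval_rpn` is a hypothetical helper that supports just the four basic arithmetic operators, not the full rrdtool operator set.

```python
import operator

# Only the four basic operators; rrdtool itself offers many more.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression, variables):
    """Evaluate a comma-separated RPN expression against named values."""
    stack = []
    for token in expression.split(","):
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression: " + expression)
    return stack[0]
```

Operands are pushed onto a stack and each operator consumes the top two, so `eval_rpn("bytes,8,*", {"bytes": 125})` converts bytes to bits without any parentheses.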

Thanos

  • URL: https://thanos.io/
  • Data Model: Prometheus
  • Language: Go
  • Database: Prometheus TSDB
  • I’ve Used: Yes

Long term storage for Prometheus based on moving Prometheus’s TSDB “blocks” to GCS/S3 or similar object storage. Wrap everything in a StoreAPI to build a federated network of data sources that can be queried by a Query component to provide a unified view of the long term data and multiple Prometheus instances. Very young, but a CNCF project. Makes Prometheus scalable in some ways.

VictoriaMetrics

Long term storage and clustering for Prometheus, fed through Prometheus remote write. Uses a modified PromQL query language.
