Installing Cyanite: A Scalable Graphite Storage Backend
I’ve been experimenting with Cyanite to make my Graphite cluster more reliable. The main problem I face is that when a data node goes down, the Graphite web app more or less stops responding to requests. Cyanite is a daemon written in Clojure that runs on the JVM. The daemon is stateless and stores timeseries data in Cassandra.
I found the documentation a bit lacking, so here’s how to set up Cyanite to build a scalable Graphite storage backend.
-
Acquire a Cassandra database cluster. You will need at least Cassandra 3.4. The Makefile tests use Cassandra 3.5. I used Cassandra 3.7 in my experiments, which is the current release as of this writing. (Note Cassandra’s new Tick-Tock based release cycle.)
Parts of the documentation indicated that Elasticsearch was required. That is no longer the case. Cyanite must store a searchable index of the metrics it has data points for so that it can resolve glob requests into a list of metrics. Example:
carbon.agents.*.metricsReceived
This is now done in Cassandra using SASI indexes, which enable CQL SELECT statements to use the LIKE operator. This is the feature that requires a more recent Cassandra version than you may be running in production.
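To make glob resolution concrete, here is a minimal Python sketch of what it means to resolve a glob against a metric index. It is an illustration only and not how Cyanite queries its SASI index: the `glob_to_regex` and `resolve` helpers and the sample metric names are hypothetical, and only the `*` wildcard is handled, which in Graphite matches within a single dot-separated segment.

```python
import re


def glob_to_regex(glob):
    """Translate a Graphite-style glob into an anchored regular expression.

    Only '*' is handled here; it matches any run of characters
    that does not cross a '.' segment boundary.
    """
    parts = [re.escape(part) for part in glob.split("*")]
    return re.compile("^" + "[^.]*".join(parts) + "$")


def resolve(glob, index):
    """Return the metric names in `index` that match `glob`, sorted."""
    pattern = glob_to_regex(glob)
    return sorted(name for name in index if pattern.match(name))


# Hypothetical index contents, for illustration only.
index = [
    "carbon.agents.host-a.metricsReceived",
    "carbon.agents.host-b.metricsReceived",
    "carbon.agents.host-a.cpuUsage",
    "carbon.relays.host-a.metricsReceived",
]

matches = resolve("carbon.agents.*.metricsReceived", index)
print(matches)
```

Only the two `carbon.agents.*` series that end in `metricsReceived` should come back; the relay series and the CPU series are filtered out.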
-
Clone the Cyanite Git repository. There are no tags or releases. However, the rumor at Monitorama 2016 is that Cyanite is a stable and scalable platform. So I just grabbed the master branch.
git clone https://github.com/pyr/cyanite.git
-
Create a Cassandra user depending on your local policy. Import the schema to initially create the keyspace you will use. The schema is found in the repository:
doc/schema.cql
Here, I altered the schema to set the replication factor I wanted. So I created my keyspace like this:
CREATE KEYSPACE IF NOT EXISTS metric WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
I’m only replicating in a Cassandra database that lives in a single data center. No cross data center replication strategies here…yet.
-
Install Leiningen. This is the build tool used by the Cyanite project. It seems very friendly and installs locally into your home directory. It allows you to build JARs and other distributable versions of the code.
-
I need to distribute code as Debian packages for Ubuntu. Fortunately, we have a target to build just that.
$ cd path/to/cyanite/repo
$ lein fatdeb
This should produce artifacts in the target/ directory.
-
Install the Cyanite packages. Configure /etc/cyanite.yaml to match your storage schema file (from carbon-cache.py) and to include the connection information for your Cassandra cluster. An example configuration with additional documentation can be found in the Cyanite repo:
doc/cyanite.yaml
Here is a sanitized version of my config. This required some parsing of the source to find needed options.
# Retention rules from storage-schema.conf
engine:
  rules:
    '^1sec\.*': [ "1s:14d" ]
    '^1min\.*': [ "60s:760d" ]
    '^carbon\..*': [ "60s:30d", "15m:2y" ]
    default: [ "60s:30d" ]

# IP and PORT where the Cyanite REST API will bind
api:
  port: 8080
  host: 0.0.0.0

# An input, carbon line protocol
input:
  - type: carbon
    port: 2003
    host: 0.0.0.0

# Store the metric index in Cassandra SASI indexes
index:
  type: cassandra
  keyspace: 'metric'
  username: XXXXXX
  password: YYYYYY
  cluster:
    - cas-000.foobar.com
    - cas-001.foobar.com
    - cas-002.foobar.com

# Time drift calculations. I use / trust NTP.
drift:
  type: no-op

# Timeseries are stored in Cassandra
store:
  keyspace: 'metric'
  username: XXXXXX
  password: YYYYYY
  cluster:
    - cas-000.foobar.com
    - cas-001.foobar.com
    - cas-002.foobar.com

# Logging configuration. See: https://github.com/pyr/unilog
logging:
  level: info
  console: true
  files:
    - "/var/log/cyanite/cyanite.log"
  overrides:
    io.cyanite: "debug"
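As a sanity check on the retention rules, the following Python sketch works out how many points each rule implies per series. It is a rough model, not Cyanite's own code: it assumes the rules use Graphite's precision:period format with Graphite's unit suffixes, and that the first matching pattern wins. The helper names are hypothetical.

```python
import re

# Seconds per unit, following Graphite's retention suffixes (an assumption here).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(spec):
    """Convert a duration such as '60s' or '30d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", spec)
    if match is None:
        raise ValueError(f"unrecognized duration: {spec!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_retention(rule):
    """Split 'precision:period' into (precision_seconds, period_seconds)."""
    precision, period = rule.split(":")
    return to_seconds(precision), to_seconds(period)


def retentions_for(metric, rules, default):
    """Return the retentions of the first pattern that matches `metric`.

    First-match-wins is an assumption of this sketch, not confirmed behavior.
    """
    for pattern, retentions in rules:
        if re.search(pattern, metric):
            return retentions
    return default


# Rules copied from the config above, kept in order.
rules = [
    (r"^1sec\.*", ["1s:14d"]),
    (r"^1min\.*", ["60s:760d"]),
    (r"^carbon\..*", ["60s:30d", "15m:2y"]),
]
default = ["60s:30d"]

for rule in retentions_for("carbon.agents.host-a.metricsReceived", rules, default):
    precision, period = parse_retention(rule)
    print(rule, "->", period // precision, "points per series")
```

Multiplying the point counts by the number of metrics you expect gives a first estimate of how much data the Cassandra cluster has to hold.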
-
Cyanite should be startable at this point. You can test that it accepts carbon line protocol metrics and that they are returned by the Cyanite REST API.
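For the ingest half of that test, a small Python sketch like the one below can push a single metric over the carbon plaintext line protocol, which is one `path value timestamp` line per metric. The `carbon_line` and `send_metric` helpers and the metric name are hypothetical, and the host and port are assumed to match the `input` section of the config above.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric in the carbon plaintext line protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, port, path, value, timestamp=None):
    """Open a TCP connection and send a single metric line."""
    line = carbon_line(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
    return line


# Show the wire format without touching the network; the metric name is made up.
print(carbon_line("test.cyanite.smoke", 42, 1468900000), end="")
```

With Cyanite running locally, `send_metric("127.0.0.1", 2003, "test.cyanite.smoke", 42)` would write the point. You can then confirm that it appears when you query the Cyanite REST API on port 8080.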
-
Package and install Graphite-API along with the Cyanite Python module. Graphite-API is a stripped-down version of the Graphite web application, built as a Flask application, that uses pluggable finders to search different storage backends. Python’s Pip can easily find these packages. This is a WSGI application, so use whatever you would normally deploy such applications with. I use mod_wsgi with Apache to run this on port 80.
A sample /etc/graphite-api.yaml to configure Graphite-API to use the Cyanite plugin and query the local Cyanite daemon:
# Where the graphite-api search index is built
search_index: /var/tmp/graphite-index

# Plugins to use to find metrics
finders:
  - cyanite.CyaniteFinder

# Additional Graphite functions
functions:
  - graphite_api.functions.SeriesFunctions
  - graphite_api.functions.PieFunctions

# Cyanite Specific options
cyanite:
  urls:
    - http://127.0.0.1:8080

time_zone: UTC
My plan here is that I can deploy many of these Cyanite / Graphite-API machines in a load balanced fashion to support my query and write loads. They are completely stateless like any good web application so choose your favorite load balancing technique.
At this point you should have a basic Cyanite setup that is able to answer
normal Graphite queries and ingest carbon metrics. You might want to use a
tool like carbon-c-relay to route metrics into the Cyanite pool. You could
point Grafana directly to the load balanced Graphite-API or use the normal
Graphite web application (if you like the Graphite composer) and list the
Graphite-API load balanced VIP as the single CLUSTER_SERVERS entry.
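To check that the stack answers normal Graphite queries, you can hit the standard `/render` endpoint and ask for JSON. The Python sketch below builds such a request. The VIP hostname is a placeholder and the helper names are hypothetical, so treat it as a starting point rather than a finished client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(base_url, target, from_="-1h", until="now"):
    """Build a Graphite render URL that asks for JSON output."""
    query = urlencode(
        {"target": target, "from": from_, "until": until, "format": "json"}
    )
    return f"{base_url.rstrip('/')}/render?{query}"


def fetch_series(base_url, target, **kwargs):
    """Fetch the series for `target` and return the decoded JSON list."""
    with urlopen(render_url(base_url, target, **kwargs), timeout=10) as response:
        return json.load(response)


# Hypothetical VIP hostname; replace it with your load balancer address.
url = render_url("http://graphite-vip.example.com", "carbon.agents.*.metricsReceived")
print(url)
```

Calling `fetch_series` with the same arguments performs the request. An empty list usually means the glob did not resolve to any metric, which points at the index rather than the datapoint store.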
This should at least get you going with Cyanite as a Graphite storage backend. There will be much tuning and testing to transform this into a scalable system depending on your exact setup. I am just starting down this path and may have more to share in the future. Or it may blow up on me. Time will tell.
Update 2016/07/19: There are several other Graphite storage backends that I’m aware of. All are Cassandra based.
- https://github.com/EinsamHauer/disthene – A Cyanite compatible project written in Java aiming for performance.
- https://github.com/raintank/metrictank – The friendly folks at RainTank appear to be cooking up a Go codebase for metric storage. A young project but lots of promise here.
What am I missing?