I’ve been experimenting with Thanos over the last few months to add long-term storage of Prometheus metric data and a single query endpoint for a large Prometheus fleet. This looks like a really well-designed solution to end the 30-ish day retention limits and to answer the oft-asked question “What host name is my Prometheus box?”
Thanos requires that each Prometheus VM or machine in the fleet have a unique set of external labels. These are special labels that are added to metrics as they leave a Prometheus VM for any reason and help identify the source. As Prometheus is usually sharded by function or team and HA/DR is implemented by having a pair of identically configured Prometheus VMs in each shard, it is recommended to use two external labels:
- `role`: This identifies the function or team this Prometheus VM serves. I use `monitor` instead here, but same difference.
- `replica`: This is an index or FQDN of the VM and is used so that each VM in a single `role` is unique. Do yourself a favor and use the FQDN here.
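For illustration, the pair of external labels above might look like this in each replica’s `prometheus.yml` (the hostname is hypothetical):

```yaml
global:
  external_labels:
    # Shard identity: which function or team this Prometheus serves.
    role: monitor
    # Replica identity: use the FQDN so each VM in a role is unique.
    replica: prom-a.example.com
```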
This was the model I deployed. Then came the headdesk moment! This didn’t completely uniquely identify all my Prometheus VMs, and Thanos said so, loudly. To Thanos, I had overlapping TSDB blocks and non-unique Prometheus VMs. I had already let each Prometheus VM upload data (those TSDB blocks) to Google Cloud Storage (GCS). I found myself in a bit of a pickle.
In my Prometheus fleet, I usually shard by team, and teams have a pair of PROD Prometheus VMs and, at their option, a non-production Prometheus VM for testing rules and such. I had forgotten to account for this, and the `replica="0"` versions of the production and non-production VMs were conflicting. Thus, another bit of wisdom: use FQDNs in the `replica` label.
Fixing this was twofold. The first step was easy. Well, easier than the second step. First, each Prometheus VM needed to be reconfigured with a third external label present. The Thanos Sidecar component also needed a good restart to find these changes.
- `promenv`: The environment of the Prometheus Role. This was purposely not named `environment`. Users and teams monitor production and non-production targets from a single Prometheus Role. It turns out teams really do want production monitoring system quality for monitoring of non-production targets. So relabeling rules attach an `environment` label to each metric at scrape time, namespacing metrics by the environment of the target. But this means that the environment of the Prometheus Role doesn’t indicate the environment of the target scrape jobs.
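A relabeling rule like the one described above might be sketched as follows (the job name and target are hypothetical; with no `source_labels`, the default `.*` regex matches and the static `replacement` is applied):

```yaml
scrape_configs:
  - job_name: payments-dev
    relabel_configs:
      # Stamp every target in this scrape job with the environment
      # of the *target*, independent of the Prometheus Role's promenv.
      - target_label: environment
        replacement: dev
    static_configs:
      - targets: ['payments-1.dev.example.com:9100']
```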
Secondly, each TSDB block already stored in GCS had the old set of external labels attached by Thanos, and these needed to be updated. Thanos stores the labels in the `meta.json` file inside each TSDB block, so this isn’t too crazy. I ended up writing a bit of Go code to work through it.
This code is passed a GCS bucket name, with or without the `gs://` prefix. The value for the `promenv` label is encoded into the GCS bucket name, so some regular expression magic validates that the bucket name follows my patterns and finds the correct `promenv` value. If the `-confirm` safety flag is given, it will then find every `meta.json` object in the bucket, parse it, and check whether a `promenv` external label exists. If that label exists, the code moves on to the next TSDB block. If it is absent, it is added to the JSON, marshaled back into text, and uploaded. Backups of the metadata files are kept on local disk.
See my `rewritemeta` command source on GitHub!
```
$ ./rewritemeta -confirm gs://bruce-thanos-lts-global-dev/
2019/05/26 21:59:13 GCS Bucket : bruce-thanos-lts-global-dev
2019/05/26 21:59:13 Promenv Value: dev
2019/05/26 21:59:38 Found: 01DBV5YHD90P9YPEE955XPJX60/meta.json
2019/05/26 21:59:38 Writing backup to: 01DBV5YHD90P9YPEE955XPJX60-meta.json
2019/05/26 21:59:38 Uploading modifed gs://bruce-thanos-lts-global-dev/01DBV5YHD90P9YPEE955XPJX60/meta.json
2019/05/26 21:59:38 Writing backup to: 01DBV5YHD90P9YPEE955XPJX60-meta.json.fixed
2019/05/26 21:59:39 Found: 01DBVCT8NARQPC1FTZ1SBCRNPN/meta.json
2019/05/26 21:59:39 Writing backup to: 01DBVCT8NARQPC1FTZ1SBCRNPN-meta.json
2019/05/26 21:59:39 Uploading modifed gs://bruce-thanos-lts-global-dev/01DBVCT8NARQPC1FTZ1SBCRNPN/meta.json
2019/05/26 21:59:39 Writing backup to: 01DBVCT8NARQPC1FTZ1SBCRNPN-meta.json.fixed
```
The only caveat is that this code does not format the `meta.json` files after modification in exactly the same order that Prometheus and Thanos do. It’s JSON, so it will parse, but I did find it surprising that the data wasn’t in the same order. After running this on a bucket, you will need to restart any Thanos Store and Compact components that are running against it.
I thought others might find something like this useful if they ever need to change the external labels previously set in Thanos.