Custom Configurations with Prometheus Operator

If you are using Prometheus Operator, you have probably run into a situation where you wished you could declare your own custom configuration.  Currently, a couple of use cases require custom configuration because Prometheus Operator does not cover them.  For example, if your use case is relabeling metrics or blackbox probing, you will need to declare your own custom configuration to accomplish these tasks.

When this article was published, custom configuration relied on a hack rather than a dedicated method for including custom configurations.  However, since v0.19.0, this hack is no longer needed.  Prometheus Operator now allows you to include additional configs that will be merged with the configs that Prometheus Operator automatically generates.  If you are using this version or later, use the additional scrape configs feature rather than the method described here.
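
For readers on v0.19.0 or later, here is a minimal sketch of what the additional scrape configs route can look like.  The secret name additional-scrape-configs, the file name prometheus-additional.yaml, and the example.com target are illustrative assumptions, not values taken from this article, so check the Prometheus Operator documentation for your version before relying on it.

```shell
#!/bin/bash
# Sketch only: the job, target, and file names are illustrative assumptions.

# Print a scrape config fragment (a YAML list of scrape jobs) to stdout.
additional_scrape_config() {
  cat <<'EOF'
- job_name: blackbox
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
  - targets:
    - https://example.com/
  relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: instance
  - target_label: __address__
    replacement: blackbox-exporter.default.svc.cluster.local:9115
EOF
}

additional_scrape_config

# Against a real cluster (not run here), save the fragment and wrap it in a secret:
#   additional_scrape_config > prometheus-additional.yaml
#   kubectl create secret generic additional-scrape-configs \
#     --from-file=prometheus-additional.yaml
# Then reference the secret from the Prometheus resource:
#   spec:
#     additionalScrapeConfigs:
#       name: additional-scrape-configs
#       key: prometheus-additional.yaml
```

With this approach the operator keeps generating its own configuration and appends the jobs from the secret, so none of the checksum handling described in the rest of this article is needed.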

Before going down this rabbit hole, be warned that I am presenting a temporary workaround until Prometheus Operator has better support for your use case that requires custom configuration.  Also be aware that this workaround relies on a hack that isn’t explicitly supported by Prometheus Operator, although the maintainers of Prometheus Operator acknowledge that the workaround exists and provide some sparse documentation on how to use it.  Additionally, I have not had an opportunity to figure out how to mount the alerting rules in Kubernetes 1.8+.  This issue was reported by Vinayak Saokar (@saokar).  If you can contribute to the solution, please post to the issue.

To understand how this workaround works, it helps to know how Prometheus Operator manages the configuration of each Prometheus instance deployed to your Kubernetes cluster.  When a value is assigned to the serviceMonitorSelector, Prometheus Operator creates a Kubernetes Secret object that contains the base64 encoded prometheus.yaml file used to configure Prometheus.  This secret can be inspected just like any other Kubernetes Secret.  Assuming you have an instance of Prometheus deployed to the monitoring namespace via Prometheus Operator, you can inspect the secret with the command below.

kubectl -n monitoring get secret prometheus-<my-deployment> -o yaml

You should see output like the following:

apiVersion: v1
data:
  configmaps.json:
    'base64 encoded string'
  prometheus.yaml:
    'base64 encoded string'
kind: Secret
metadata:
  annotations:
    generated: "true"
  creationTimestamp: 2037-10-10T20:35:47Z
  labels:
    managed-by: prometheus-operator
  name: prometheus-<my-deployment>
  namespace: monitoring
  resourceVersion: "0000"
  selfLink: /api/v1/namespaces/monitoring/secrets/prometheus-<my-deployment>
  uid: <some-uid>
type: Opaque

Notice that there are two base64 encoded files passed as data in the secret.  The first is a configmaps.json file that contains a checksum for each ConfigMap of rule files that Prometheus should load.  The second is the prometheus.yaml file itself.  If you don’t pass Prometheus Operator a serviceMonitorSelector, you can pass your own secret in place of this one.
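
To read what is actually inside one of those fields, you can decode it.  The sketch below assumes a base64 binary that accepts --decode (GNU coreutils and current macOS both do) and uses placeholder arguments for the namespace and secret name.  The show_prometheus_config helper is hypothetical and needs kubectl with access to your cluster.

```shell
#!/bin/bash
# Decode one base64 encoded value read from stdin.
decode_field() {
  base64 --decode
}

# Hypothetical helper: fetch the prometheus.yaml field of a secret and decode it.
# Requires kubectl and cluster access, so it is defined here but not called.
# The backslash escapes the dot inside the key name for kubectl's jsonpath.
show_prometheus_config() {
  local namespace="$1" secret="$2"
  kubectl -n "$namespace" get secret "$secret" \
    -o 'jsonpath={.data.prometheus\.yaml}' | decode_field
}

# Local round trip that needs no cluster: encode a snippet, then decode it.
printf 'global: {}' | base64 | decode_field
echo
```

Against a real cluster you would call something like show_prometheus_config monitoring prometheus-<my-deployment>, substituting your own secret name.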

Let’s take advantage of this to set up blackbox probing of my website joecreager.com.  Blackbox probing simply means probing a system from the outside because you can’t instrument it internally.  For example, I can’t instrument the applications that make Google’s search engine work, but I can probe google.com to see if it is up, and to see how long it takes to respond.  The company Pingdom actually sells premium services to do just that for your own website if you are concerned with things like uptime.

To do the blackbox probing, the first thing that we need is an application to handle the probes.  I am going to use blackbox_exporter.  You can get your own copies of the configuration that I am using for blackbox_exporter here.

To run blackbox_exporter in my cluster, I am using the deployment.yml and services.yml files below:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: blackbox-exporter
  namespace: default
  labels:
    app: blackbox-exporter
spec:
  replicas: 1
  strategy: 
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  progressDeadlineSeconds: 100
  minReadySeconds: 5
  revisionHistoryLimit: 10
  template:
    metadata:
      labels:
        app: blackbox-exporter
    spec:
      containers:
      - name: blackbox
        image: josephcreager/blackbox_exporter:0.11.0 
        imagePullPolicy: Always
        ports:
        - name: blackbox
          protocol: TCP
          containerPort: 9115
        livenessProbe:
          tcpSocket:
            port: blackbox
          initialDelaySeconds: 10
          timeoutSeconds: 10
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 2
---
apiVersion: v1
kind: Service
metadata:
  name: blackbox-exporter
  namespace: default
  labels:
    app: blackbox-exporter
spec:
  type: ClusterIP
  selector:
    app: blackbox-exporter
  ports:
  - name: blackbox
    port: 9115
    protocol: TCP
    targetPort: blackbox

If you copy these files to a directory, you can run kubectl apply -f . in that directory to create the resources in your own cluster or minikube instance.  This particular configuration is very simple, and only checks whether the probe returns a 200.
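
Before wiring up Prometheus, it can be worth checking the exporter by hand.  The sketch below builds the URL for blackbox_exporter’s /probe endpoint.  The probe_url helper and the example.com target are assumptions for illustration, and the http_2xx module is assumed to be defined in the exporter’s configuration.

```shell
#!/bin/bash
# Build the URL for a manual probe against blackbox_exporter's /probe endpoint.
# Arguments: exporter base URL, target to probe, module name.
probe_url() {
  local exporter="$1" target="$2" module="$3"
  printf '%s/probe?target=%s&module=%s\n' "$exporter" "$target" "$module"
}

probe_url "http://localhost:9115" "https://example.com" "http_2xx"

# With a port-forward to the exporter running in another terminal, for example
#   kubectl port-forward deployment/blackbox-exporter 9115:9115
# fetch the probe metrics and look for a line reading "probe_success 1":
#   curl -s "$(probe_url http://localhost:9115 https://example.com http_2xx)"
```

Targets that contain their own query string would need URL encoding first, which this helper does not attempt.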

After setting up our exporter, it’s time to create a prometheus instance to record the probe results.  You can get a copy of the files that I will use to create these resources here.  Let’s take care of some of the more straightforward stuff first.  We can create our configMaps, storage, and services before we need to worry about how to manage the secret.  Go ahead and kubectl create -f these files if you are following along.

apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: default
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard

The storage.yml file tells kubernetes to provision a persistent disk.  I am using GCE.  Refer to the kubernetes docs on StorageClass if you are using another provisioner.

apiVersion: v1
kind: Service
metadata:
  name: blackbox-prometheus
  namespace: default
  labels:
    #meta-monitoring: TODO enable monitoring of this service by another service
    #monitoring: custom
spec:
  selector:
    prometheus: blackbox 
  type: ClusterIP
  ports:
  - name: prometheus
    port: 9090
    targetPort: web
    protocol: TCP

There isn’t much to say about the services.yml file, but keep the port number 9090 in mind as we will use it later to access the prometheus UI.

kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-rulefiles-blackbox
  namespace: default
  labels:
    role: prometheus-rulefiles
    prometheus: blackbox
data:
  recording.rules: |-
  up.rules: |-

I’m not doing much with my configMap.yml yet, but later on it will house any alerting or recording rules that I’d like to have.  It is mainly important because the secret we will create for our custom configuration relies on this file existing.

Finally, we can write our prometheusConfig.yml file.  This file won’t be consumed by kubernetes directly.  Instead, we will pass this file to kubernetes as a secret so that prometheus can consume it for its configuration.  Normally, I would say don’t share your kubernetes secrets, but in this case, there isn’t much that is secret about this file.

global:
  scrape_interval: 30s
  scrape_timeout: 10s
  evaluation_interval: 30s
alerting:
  alertmanagers:
  - kubernetes_sd_configs:
    - api_server: null
      role: endpoints
      namespaces:
        names:
        - monitoring
    scheme: http
    path_prefix: /
    timeout: 10s
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_name]
      separator: ;
      regex: alertmanager-operated
      replacement: $1
      action: keep
rule_files:
- /etc/prometheus/rules/rules-0/*.rules
scrape_configs:
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]  # Look for a HTTP 200 response.
    static_configs:
      - targets:
        - http://localhost:8080/  # URL to probe; replace with the site you want to check
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter.default.svc.cluster.local:9115
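
The relabel_configs at the end of this file are what turn a plain target URL into a request to the exporter.  As a rough illustration, the sketch below traces the three steps for one target in plain bash.  The trace_relabel function is hypothetical and only mimics the label assignments; it is not how Prometheus implements relabeling.

```shell
#!/bin/bash
# Trace the three relabel steps from the blackbox job for a single target.
# Arguments: the target from static_configs, the exporter address.
trace_relabel() {
  local address="$1"              # initial __address__ taken from static_configs
  local exporter="$2"             # replacement value used in the final step
  local param_target="$address"   # step 1: copy __address__ into __param_target
  local instance="$param_target"  # step 2: copy __param_target into instance
  address="$exporter"             # step 3: overwrite __address__ with the exporter
  printf '__param_target=%s\ninstance=%s\n__address__=%s\n' \
    "$param_target" "$instance" "$address"
}

trace_relabel "http://localhost:8080/" "blackbox-exporter.default.svc.cluster.local:9115"
```

The net effect is that Prometheus scrapes the exporter’s /probe path, passes the original URL along as the target parameter, and still labels the resulting series with the URL that was probed.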

If we wanted to, we could stop here and make a secrets.yml file that has a base64 encoded version of this file, plus a base64 encoded version of a configmaps.json file.  The first time I set blackbox_exporter up it was for work, and this is how I did it because I was in a crunch for time.  But manually editing and applying the secrets was a huge pain in the neck, so I wised up and wrote a little bash script to take care of generating the secret for me.  Bash may not be able to do floating point multiplication, but it’s great for templating.  I recommend that you use this script as well.  If you are on a mac, you may need to run brew update && brew install shasum.  If you are on linux, you should replace the shasum command in this script with the linux equivalent (for example, sha256sum).

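#!/bin/bash

# Fail early if an input file is missing.  Checking "$?" after a pipeline
# only reports on the last command, so it cannot detect a failed cat.
for f in configMap.yml prometheusConfig.yml; do
  if [ ! -f "$f" ]; then
    echo "cannot find $f" >&2
    exit 1
  fi
done

# Keep only the hex digest; shasum also prints the input name after it.
CHECKSUM="$(shasum -a 256 < configMap.yml | awk '{print $1}')"

# The key is <namespace>/<name> of the rule file ConfigMap, which lives in
# the default namespace in this article.
CONFIGMAPS_JSON="{\"items\":[{\"key\":\"default/prometheus-rulefiles-blackbox\",\"checksum\":\"$CHECKSUM\"}]}"

# tr removes the line wrapping that GNU base64 adds, which would break the YAML.
SECRET="
kind: Secret
apiVersion: v1
metadata:
    name: prometheus-blackbox
    namespace: default
data:
    configmaps.json: \"$(printf '%s\n' "$CONFIGMAPS_JSON" | base64 | tr -d '\n')\"
    prometheus.yaml: \"$(base64 < prometheusConfig.yml | tr -d '\n')\"
"

printf '%s\n' "$SECRET" | kubectl apply -f -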

As you can see, I am generating the configmaps.json file and the secret dynamically, and then applying the secret.
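
If you want the same script to run on both macOS and Linux, one option is to hide the platform differences behind small helpers.  This is a sketch under the assumption that either sha256sum or shasum is installed.  The function names sha256_hex and base64_oneline are made up for this example.

```shell
#!/bin/bash
# Print the SHA-256 hex digest of stdin using whichever tool is installed.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | awk '{print $1}'
  else
    shasum -a 256 | awk '{print $1}'
  fi
}

# Base64 encode stdin as a single line.  GNU base64 wraps long output by
# default, and a wrapped value would not be valid inside the secret's YAML.
base64_oneline() {
  base64 | tr -d '\n'
}

# Example: digest and encode a short string.
printf 'abc' | sha256_hex
printf 'hello' | base64_oneline
echo
```

In the script above, these would stand in for the shasum and base64 calls, for example CHECKSUM="$(sha256_hex < configMap.yml)".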

Finally, we can create the prometheus instance by running kubectl apply -f on the file below.  But remember, this only works if you have prometheus-operator running.  If you haven’t set up prometheus-operator, it only takes one minute.

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: blackbox
  namespace: default
spec:
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: default
        selector:
          matchLabels:
            storage: blackbox-prometheus
        resources:
          requests:
            storage: 10Gi
  resources:
    limits:
      cpu: 0.5
      memory: 0.5Gi
    requests:
      cpu: 0.25
      memory: 0.25Gi
  ruleSelector:
    matchLabels:
      role: prometheus-rulefiles
      prometheus: blackbox

We could also make a public facing interface for our prometheus instance using nginx and basic auth.  But I’m not going to go to the trouble, because we can use port forwarding to access the prometheus UI instead.  After applying the prometheus.yml file, run kubectl port-forward prometheus-blackbox-0 9090:9090 to forward the application port to your local machine.  Then, open localhost:9090 in a browser to reach the prometheus UI.
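
The same forwarded port also serves the Prometheus HTTP API, which is handy for a quick check from the terminal.  The sketch below builds an instant query URL.  The query_url helper is hypothetical, and probe_success is the metric blackbox_exporter reports for each probe.

```shell
#!/bin/bash
# Build an instant query URL for the Prometheus HTTP API.
# Only suitable for simple expressions such as a bare metric name, because
# the expression is not URL encoded here.
query_url() {
  local base="$1" expr="$2"
  printf '%s/api/v1/query?query=%s\n' "$base" "$expr"
}

query_url "http://localhost:9090" "probe_success"

# With the port-forward running, either of these returns the result as JSON:
#   curl -s "$(query_url http://localhost:9090 probe_success)"
#   curl -sG http://localhost:9090/api/v1/query --data-urlencode 'query=probe_success'
```

The second curl form lets curl handle the encoding, so it is the safer choice for expressions that contain braces, quotes, or spaces.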

[Figure: Prometheus user interface]

The image above shows the probe_http_duration_seconds metrics for my site.  The response times are pretty good.  Eventually, I’ll set up alerting so that I can be notified if my site is unavailable for any reason.  In addition to monitoring my personal website, I use the same strategy to probe critical endpoints at work so that our team can be alerted if any endpoints are unavailable.  Hopefully this information saves you a lot of time as you set up your own blackbox probes.