If you are using Prometheus Operator, you have probably encountered an occasion where you wished you could declare your own custom configuration. Currently, there are a couple of use cases that require custom configurations because Prometheus Operator does not cover them. For example, if you need to relabel metrics or do blackbox probing, you will need to declare your own custom configuration to accomplish these tasks.
When this article was published, custom configuration relied on a hack rather than a dedicated mechanism for including custom configurations. However, since v0.19.0, this hack is no longer needed. Prometheus Operator now allows you to include additional configs that will be merged with the configs that Prometheus Operator automatically generates. If you are using that version or later, use the additional scrape configs feature rather than the method described here.
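For reference, with v0.19.0 or later the newer approach is roughly this: put your extra scrape configs in a Secret and reference it from the Prometheus resource. The sketch below assumes a secret named additional-scrape-configs with a key prometheus-additional.yaml; check the Prometheus Operator docs for the exact details of your version.

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
spec:
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: prometheus-additional.yaml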
Before going down this rabbit hole, be warned that I am presenting a temporary workaround to use until Prometheus Operator has better support for your use case that requires custom configuration. Also be aware that this workaround relies on a hack that isn’t explicitly supported by Prometheus Operator, although the maintainers of Prometheus Operator acknowledge that the workaround exists and provide some sparse documentation on how to use it. Additionally, I have not had an opportunity to figure out how to mount the alerting rules in Kubernetes 1.8+. This issue was reported by Vinayak Saokar (@saokar). If you can contribute to the solution, please post to the issue.
To understand how this workaround works, it helps to know how Prometheus Operator manages the configuration of each Prometheus instance deployed to your Kubernetes cluster. When a value is assigned to the serviceMonitorSelector, Prometheus Operator creates a Kubernetes Secret object which contains the base64 encoded prometheus.yaml file used to configure Prometheus. This secret can be inspected just like any other Kubernetes Secret. Assuming you have an instance of Prometheus deployed to the monitoring namespace via Prometheus Operator, you can inspect the secret with the command below.
kubectl -n monitoring get secret prometheus-<my-deployment> -o yaml
You should see a file like the one below:
apiVersion: v1
data:
  configmaps.json: 'base64 encoded string'
  prometheus.yaml: 'base64 encoded string'
kind: Secret
metadata:
  annotations:
    generated: "true"
  creationTimestamp: 2037-10-10T20:35:47Z
  labels:
    managed-by: prometheus-operator
  name: prometheus-<my-deployment>
  namespace: monitoring
  resourceVersion: "0000"
  selfLink: /api/v1/namespaces/monitoring/secrets/prometheus-<my-deployment>
  uid: <some-uid>
type: Opaque
Notice that there are two base64 encoded files that are passed as data in the secret. The first is a configmaps.json file that contains a checksum for the version of prometheus.yaml that is passed in the data object of the secret. If you don’t pass Prometheus Operator a serviceMonitorSelector, you can pass your own secret in place of this one.
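If you want to read the generated configuration in plain text, you can decode it straight out of the secret; something along the lines of kubectl -n monitoring get secret prometheus-<my-deployment> -o jsonpath='{.data.prometheus\.yaml}' | base64 --decode should print the prometheus.yaml that the operator built.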
Let’s take advantage of this to set up blackbox probing of my website, joecreager.com. Blackbox probing simply means probing a machine that you can’t instrument internally. For example, I can’t instrument the applications that make Google’s search engine work, but I can probe google.com to see if it is up, and to see how long it takes to respond. The company Pingdom actually sells premium services to do just that for your own website if you are concerned with things like uptime.
To do the blackbox probing, the first thing that we need is an application to handle the probes. I am going to use blackbox_exporter. You can get your own copies of the configuration that I am using for blackbox_exporter here.
To run blackbox_exporter in my Kubernetes cluster, I am using the deployment.yml and services.yml files below:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: blackbox-exporter
  namespace: default
  labels:
    app: blackbox-exporter
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  progressDeadlineSeconds: 100
  minReadySeconds: 5
  revisionHistoryLimit: 10
  template:
    metadata:
      labels:
        app: blackbox-exporter
    spec:
      containers:
      - name: blackbox
        image: josephcreager/blackbox_exporter:0.11.0
        imagePullPolicy: Always
        ports:
        - name: blackbox
          protocol: TCP
          containerPort: 9115
        livenessProbe:
          tcpSocket:
            port: blackbox
          initialDelaySeconds: 10
          timeoutSeconds: 10
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 2
apiVersion: v1
kind: Service
metadata:
  name: blackbox-exporter
  namespace: default
  labels:
    app: blackbox-exporter
spec:
  type: ClusterIP
  selector:
    app: blackbox-exporter
  ports:
  - name: blackbox
    port: 9115
    protocol: TCP
    targetPort: blackbox
If you copy these files to a directory, you can run kubectl apply -f . in that directory to create the resources in your own cluster or minikube instance. This particular configuration is very simple, and only checks to see if the probe returns a 200 response.
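If you want to sanity check the exporter before pointing Prometheus at it, you can port forward the pod and hit the probe endpoint by hand; the target below is just an example:

kubectl port-forward <blackbox-exporter-pod-name> 9115:9115
curl 'http://localhost:9115/probe?target=https://joecreager.com&module=http_2xx'

A successful probe includes a probe_success 1 line in the response.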
After setting up our exporter, it’s time to create a Prometheus instance to record the probe results. You can get a copy of the files that I will use to create these resources here. Let’s take care of some of the more straightforward stuff first. We can create our configMaps, storage, and services before we need to worry about how to manage the secret. Go ahead and kubectl create -f these files if you are following along.
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: default
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
The storage.yml file tells Kubernetes to provision a persistent disk. I am using GCE. Refer to the Kubernetes docs on StorageClass if you are using another provisioner.
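If you are on AWS instead, for example, the equivalent class would use the EBS provisioner; a sketch might look like the one below, assuming the standard gp2 volume type:

apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: default
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2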
apiVersion: v1
kind: Service
metadata:
  name: blackbox-prometheus
  namespace: default
  labels:
    #meta-monitoring: TODO enable monitoring of this service by another service
    #monitoring: custom
spec:
  selector:
    prometheus: blackbox
  type: ClusterIP
  ports:
  - name: prometheus
    port: 9090
    targetPort: web
    protocol: TCP
There isn’t much to say about the services.yml file, but keep the port number 9090 in mind, as we will use it later to access the Prometheus UI.
kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-rulefiles-blackbox
  namespace: default
  labels:
    role: prometheus-rulefiles
    prometheus: blackbox
data:
  recording.rules: |-
  up.rules: |-
I’m not doing much with my configMap.yml yet, but later on it will house any alerting or recording rules that I’d like to have. It is mainly important because the secret we will create for our custom configuration relies on this file existing.
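To give a sense of what might live there eventually, here is a hypothetical rule that fires when a probe has been failing for five minutes. It assumes the Prometheus 2.x YAML rule group format and the probe_success metric that blackbox_exporter exposes; adjust it for whatever Prometheus version and rule format your operator expects:

data:
  up.rules: |-
    groups:
    - name: blackbox.rules
      rules:
      - alert: EndpointDown
        expr: probe_success == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Blackbox probe failing for {{ $labels.instance }}"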
Finally, we can write our prometheusConfig.yml file. This file won’t be consumed by Kubernetes directly. Instead, we will pass this file as a secret to Kubernetes so that Prometheus can consume it for its configuration. Normally, I would say don’t share your Kubernetes secrets, but in this case, there isn’t much that is secret about this file.
global:
  scrape_interval: 30s
  scrape_timeout: 10s
  evaluation_interval: 30s
alerting:
  alertmanagers:
  - kubernetes_sd_configs:
    - api_server: null
      role: endpoints
      namespaces:
        names:
        - monitoring
    scheme: http
    path_prefix: /
    timeout: 10s
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_name]
      separator: ;
      regex: alertmanager-operated
      replacement: $1
      action: keep
rule_files:
- /etc/prometheus/rules/rules-0/*.rules
scrape_configs:
- job_name: 'blackbox'
  metrics_path: /probe
  params:
    module: [http_2xx]  # Look for a HTTP 200 response.
  static_configs:
  - targets:
    - http://localhost:8080/
  relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: instance
  - target_label: __address__
    replacement: blackbox-exporter.default.svc.cluster.local:9115
If we wanted to, we could stop here and make a secrets.yml file that has a base64 encoded version of this file, plus a base64 encoded version of a configmaps.json file. The first time I set blackbox_exporter up it was for work, and this is how I did it because I was in a crunch for time. But manually editing and applying the secrets was a huge pain in the neck, so I wised up and wrote a little bash script to take care of generating the secret for me. Bash may not be able to do floating point multiplication, but it’s great for templating. I recommend that you use this script as well. If you are on a Mac, you may need to run brew update && brew install shasum. If you are on Linux, you should replace the shasum command with the Linux equivalent in this script.
#!/bin/bash

# Checksum of the rule file configMap; the operator uses configmaps.json to detect changes.
CHECKSUM="$(cat configMap.yml | shasum -a 256)"
if [ "$?" -ne 0 ]; then
  echo "cannot find configMap.yml"
  exit 1
fi

CONFIGMAPS_JSON="{\"items\":[{\"key\":\"monitoring/prometheus-rulefiles-blackbox\",\"checksum\":\""$CHECKSUM"\"}]}"

PROMETHEUS_YAML="$(cat prometheusConfig.yml)"
if [ "$?" -ne 0 ]; then
  echo "cannot find prometheusConfig.yml"
  exit 1
fi

# Template the secret with both files base64 encoded.
# Note: on Linux you may need base64 -w 0 to avoid line wrapping in the encoded output.
SECRET="
kind: Secret
apiVersion: v1
metadata:
  name: prometheus-blackbox
  namespace: default
data:
  configmaps.json: \""$(echo "$CONFIGMAPS_JSON" | base64)"\"
  prometheus.yaml: \""$(echo "$PROMETHEUS_YAML" | base64)"\"
"

# Apply the generated secret to the cluster.
cat <<EOF | kubectl apply -f -
$(echo "$SECRET")
EOF
As you can see, I am generating the configmaps.json file and the secret dynamically, and then applying the secret.
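If you want to double check the result, running kubectl -n default get secret prometheus-blackbox -o yaml should show the same two base64 encoded keys that the operator-generated secret had.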
Finally, we can create the Prometheus instance by running kubectl apply -f on the file below. But remember, this only works if you have prometheus-operator running. If you haven’t set up prometheus-operator, it only takes one minute.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: blackbox
  namespace: default
spec:
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: default
        selector:
          matchLabels:
            storage: blackbox-prometheus
        resources:
          requests:
            storage: 10Gi
  resources:
    limits:
      cpu: 0.5
      memory: 0.5Gi
    requests:
      cpu: 0.25
      memory: 0.25Gi
  ruleSelector:
    matchLabels:
      role: prometheus-rulefiles
      prometheus: blackbox
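Once that is applied, you can confirm that the instance came up with kubectl -n default get pod prometheus-blackbox-0 before moving on.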
We could also make a public-facing interface for our Prometheus instance using nginx plus basic auth. But I’m not going to go through the trouble, because we can also use port forwarding to access the Prometheus UI. After applying the prometheus.yml file, run kubectl port-forward prometheus-blackbox-0 9090:9090 to forward the application port to your local machine. Then, we can visit localhost:9090 to reach the Prometheus UI.
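Once the UI is up, querying probe_success or probe_http_duration_seconds in the expression browser is a quick way to confirm that the scrape is working; both are standard blackbox_exporter metrics.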
The image above shows the probe_http_duration_seconds metrics for my site. The response times are pretty good. Eventually, I’ll set up alerting so that I can be notified if my site is unavailable for any reason. In addition to monitoring my personal website, I use the same strategy to probe critical endpoints at work so that our team can be alerted if any endpoints are unavailable. Hopefully this information saves you a lot of time as you set up your own blackbox probes.