Deploying Prometheus Operator with Grafana on Linode Kubernetes Engine

Updated by Linode. Written by Ben Bigger.

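In this guide, you will deploy the Prometheus Operator to your Linode Kubernetes Engine (LKE) cluster using Helm, either as a minimal deployment accessed locally with kubectl port-forward, or as a deployment with HTTPS and basic auth for public access.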

The Prometheus Operator Monitoring Stack

When administering any system, effective monitoring tools help users diagnose and resolve issues quickly. This need for monitoring solutions has led to the development of several prominent open source tools designed to solve problems associated with monitoring diverse systems.

Since its release in 2016, Prometheus has become a leading monitoring tool for containerized environments including Kubernetes. Alertmanager is often used with Prometheus to send and manage alerts with tools such as Slack. Grafana, an open source visualization tool with a robust web interface, is commonly deployed along with Prometheus to provide centralized visualization of system metrics.

The community-supported Prometheus Operator Helm Chart provides a complete monitoring stack including each of these tools along with Node Exporter and kube-state-metrics, and is designed to provide robust Kubernetes monitoring in its default configuration.

While there are several options for deploying the Prometheus Operator, using Helm, a Kubernetes “package manager,” to deploy the community-supported Prometheus Operator chart enables you to:

  • Control the components of your monitoring stack with a single configuration file.
  • Easily manage and upgrade your deployments.
  • Utilize out-of-the-box Grafana interfaces built for Kubernetes monitoring.

Before You Begin

Note
This guide was written using Kubernetes version 1.17.
  1. Deploy an LKE Cluster. This guide was written using an example node pool with three 2 GB Linodes. Depending on the workloads you will be deploying on your cluster, you may consider using Linodes with more available resources.

  2. Install Helm 3 to your local environment.

  3. Install kubectl to your local environment and connect to your cluster.

  4. Create the monitoring namespace on your LKE cluster:

    kubectl create namespace monitoring
    
  5. Create a directory named lke-monitor to store all of your Helm values and Kubernetes manifest files and move into the new directory:

    mkdir ~/lke-monitor && cd ~/lke-monitor
    
  6. Add the stable Helm charts repository to your Helm repos:

    helm repo add stable https://charts.helm.sh/stable
    
  7. Update your Helm repositories:

    helm repo update
    
  8. (Optional) For public access with HTTPS and basic auth configured for the web interfaces of your monitoring tools:

    • Purchase a domain name from a reliable domain registrar and configure your registrar to use Linode’s nameservers with your domain. Using Linode’s DNS Manager, create a new Domain for the one that you have purchased.

    • Ensure that htpasswd is installed to your local environment. For many systems, this tool has already been installed. Debian and Ubuntu users will have to install the apache2-utils package with the following command:

      sudo apt install apache2-utils
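
Before moving on, it can help to confirm that the command-line tools from the steps above are actually available. The following is a minimal sketch, not part of the original guide; the function name missing_tools is hypothetical, and htpasswd only matters if you plan to follow the optional HTTPS and basic auth section.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools used in this guide; htpasswd is only needed for the optional HTTPS setup.
missing=$(missing_tools kubectl helm htpasswd)
if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
else
    echo "All required tools found."
fi
```

The check only confirms that the binaries exist; it does not verify that kubectl can reach your cluster.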
      

Prometheus Operator Minimal Deployment

In this section, you will complete a minimal deployment of the Prometheus Operator for individual/local access with kubectl Port-Forward. If you require your monitoring interfaces to be publicly accessible over the internet, you can skip to the following section on completing a Prometheus Operator Deployment with HTTPS and Basic Auth.

Deploy Prometheus Operator

In this section, you will create a Helm chart values file and use it to deploy Prometheus Operator to your LKE cluster.

  1. Using the text editor of your choice, create a file named values.yaml in the ~/lke-monitor directory and save it with the configurations below. Since the control plane is Linode-managed, as part of this step we will also disable metrics collection for the control plane components:

    Caution
    The below configuration will establish persistent data storage with three separate 10GiB Block Storage Volumes for Prometheus, Alertmanager, and Grafana. Because the Prometheus Operator deploys as StatefulSets, these Volumes and their associated Persistent Volume resources must be deleted manually if you later decide to tear down this Helm release.
    ~/lke-monitor/values.yaml
    
    # Prometheus Operator Helm Chart values for Linode Kubernetes Engine minimal deployment
    prometheus:
      prometheusSpec:
        storageSpec:
          volumeClaimTemplate:
            spec:
              storageClassName: linode-block-storage-retain
              resources:
                requests:
                  storage: 10Gi
    
    alertmanager:
      alertmanagerSpec:
        storage:
          volumeClaimTemplate:
            spec:
              storageClassName: linode-block-storage-retain
              resources:
                requests:
                  storage: 10Gi
    
    grafana:
      persistence:
        enabled: true
        storageClassName: linode-block-storage-retain
        size: 10Gi
    
    # Disable metrics for Linode-managed Kubernetes control plane elements
    kubeEtcd:
      enabled: false
    
    kubeControllerManager:
      enabled: false
    
    kubeScheduler:
      enabled: false
        
  2. Export an environment variable to store your Grafana admin password:

    Note
    Replace prom-operator in the below command with a secure password and save the password for later reference.
    export GRAFANA_ADMINPASSWORD="prom-operator"
    
  3. Using Helm, deploy a Prometheus Operator release labeled lke-monitor in the monitoring namespace on your LKE cluster with the settings established in your values.yaml file:

    helm install \
    lke-monitor stable/prometheus-operator \
    -f ~/lke-monitor/values.yaml \
    --namespace monitoring \
    --set grafana.adminPassword="$GRAFANA_ADMINPASSWORD" \
    --set prometheusOperator.createCustomResource=false
    
    Note

    The --set prometheusOperator.createCustomResource=false option in the above command prevents messages similar to manifest_sorter.go:192: info: skipping unknown hook: "crd-install" from appearing. If you omit the option, you can safely ignore these messages, as discussed in this GitHub issues thread.

  4. Verify that the Prometheus Operator has been deployed to your LKE cluster and its components are running and ready by checking the pods in the monitoring namespace:

    kubectl -n monitoring get pods
    

    You should see output similar to the following:

      
    NAME                                                     READY   STATUS    RESTARTS   AGE
    alertmanager-lke-monitor-prometheus-ope-alertmanager-0   2/2     Running   0          45s
    lke-monitor-grafana-84cbb54f98-7gqtk                     2/2     Running   0          54s
    lke-monitor-kube-state-metrics-68c56d976f-n587d          1/1     Running   0          54s
    lke-monitor-prometheus-node-exporter-6xt8m               1/1     Running   0          53s
    lke-monitor-prometheus-node-exporter-dkc27               1/1     Running   0          53s
    lke-monitor-prometheus-node-exporter-pkc65               1/1     Running   0          53s
    lke-monitor-prometheus-ope-operator-f87bc9f7c-w56sw      2/2     Running   0          54s
    prometheus-lke-monitor-prometheus-ope-prometheus-0       3/3     Running   1          35s
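
Rather than reading the pod table by eye, you can filter it for pods that are not yet ready. The sketch below is one possible approach, assuming the default column layout shown above (NAME, READY, STATUS); the function name not_ready_pods is hypothetical.

```shell
# Read `kubectl get pods` table output on stdin and print the name of each pod
# whose READY count (for example 1/2) is incomplete or whose STATUS is not Running.
not_ready_pods() {
    awk 'NR > 1 {
        split($2, ready, "/")
        if (ready[1] != ready[2] || $3 != "Running") print $1
    }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl -n monitoring get pods | not_ready_pods
```

An empty result means every listed pod is running with all containers ready. If you prefer to block until that is the case, kubectl -n monitoring wait --for=condition=Ready pods --all --timeout=300s is an alternative.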
        
    

Access Monitoring Interfaces with Port-Forward

  1. List the services running in the monitoring namespace and review their respective ports:

    kubectl -n monitoring get svc
    

    You should see an output similar to the following:

      
    NAME                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                     AGE
    alertmanager-operated                     ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP  115s
    lke-monitor-grafana                       ClusterIP   10.128.140.155  <none>        80/TCP                      2m3s
    lke-monitor-kube-state-metrics            ClusterIP   10.128.165.34   <none>        8080/TCP                    2m3s
    lke-monitor-prometheus-node-exporter      ClusterIP   10.128.192.213  <none>        9100/TCP                    2m3s
    lke-monitor-prometheus-ope-alertmanager   ClusterIP   10.128.153.6    <none>        9093/TCP                    2m3s
    lke-monitor-prometheus-ope-operator       ClusterIP   10.128.198.160  <none>        8080/TCP,443/TCP            2m3s
    lke-monitor-prometheus-ope-prometheus     ClusterIP   10.128.121.47   <none>        9090/TCP                    2m3s
    prometheus-operated                       ClusterIP   None            <none>        9090/TCP                    105s
        
    

    From the above output, the resource services you will access have the corresponding ports:

    Resource       Service Name                              Port
    Prometheus     lke-monitor-prometheus-ope-prometheus     9090
    Alertmanager   lke-monitor-prometheus-ope-alertmanager   9093
    Grafana        lke-monitor-grafana                       80
  2. Use kubectl port-forward to open a connection to a service, then access the service’s interface by entering the corresponding address in your web browser:

    Note
    Press Control+C on your keyboard to terminate a port-forward process after entering any of the following commands.
    • To provide access to the Prometheus interface at the address 127.0.0.1:9090 in your web browser, enter:

      kubectl -n monitoring \
      port-forward \
      svc/lke-monitor-prometheus-ope-prometheus \
      9090
      
    • To provide access to the Alertmanager interface at the address 127.0.0.1:9093 in your web browser, enter:

      kubectl -n monitoring \
      port-forward \
      svc/lke-monitor-prometheus-ope-alertmanager  \
      9093
      
    • To provide access to the Grafana interface at the address 127.0.0.1:8081 in your web browser, enter:

      kubectl -n monitoring \
      port-forward \
      svc/lke-monitor-grafana  \
      8081:80
      

      Log in with the username admin and the password you exported as $GRAFANA_ADMINPASSWORD. The Grafana dashboards are accessible at Dashboards > Manage from the left navigation bar.
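
If you open these interfaces often, the three port-forward commands can be wrapped in a small helper. This is a sketch under the assumption that your release is named lke-monitor as in this guide; the function names forward_target and monitor_forward are hypothetical.

```shell
# Map an interface name to the "service local:remote" port-forward arguments
# used in the commands above. The release name lke-monitor is assumed.
forward_target() {
    case "$1" in
        prometheus)   echo "svc/lke-monitor-prometheus-ope-prometheus 9090:9090" ;;
        alertmanager) echo "svc/lke-monitor-prometheus-ope-alertmanager 9093:9093" ;;
        grafana)      echo "svc/lke-monitor-grafana 8081:80" ;;
        *)            echo "unknown interface: $1" >&2; return 1 ;;
    esac
}

# Open a port-forward for the named interface; press Control+C to stop it.
monitor_forward() {
    target=$(forward_target "$1") || return 1
    # Intentionally unquoted so the service and the port pair become two arguments.
    kubectl -n monitoring port-forward $target
}
```

For example, monitor_forward grafana makes Grafana available at 127.0.0.1:8081 until you stop the process.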

Prometheus Operator Deployment with HTTPS and Basic Auth

Note
Before you start on this section, ensure that you have completed all of the steps in Before You Begin.

This section will show you how to install and configure the necessary components for secure, path-based, public access to the Prometheus, Alertmanager, and Grafana interfaces using the domain you have configured for use with Linode.

An Ingress is used to provide external routes, via HTTP or HTTPS, to your cluster’s services. An Ingress Controller, like the NGINX Ingress Controller, fulfills the requirements presented by the Ingress using a load balancer.

To enable HTTPS on your monitoring interfaces, you will create a Transport Layer Security (TLS) certificate from the Let’s Encrypt certificate authority (CA) using the ACME protocol. This will be facilitated by cert-manager, the native Kubernetes certificate management controller.

While the Grafana interface is natively password-protected, the Prometheus and Alertmanager interfaces must be secured by other means. This guide covers basic authentication configurations to secure the Prometheus and Alertmanager interfaces.

If you are completing this section of the guide after completing a Prometheus Operator Minimal Deployment, you can use Helm to upgrade your release and maintain the persistent data storage for your monitoring stack.

Install the NGINX Ingress Controller

In this section, you will install the NGINX Ingress Controller using Helm, which will create a NodeBalancer to handle your cluster’s traffic.

  1. Install the Google stable NGINX Ingress Controller Helm chart:

    helm install nginx-ingress stable/nginx-ingress
    
  2. Access your NodeBalancer’s assigned external IP address.

    kubectl -n default get svc -o wide nginx-ingress-controller
    

    The command will return output similar to the following:

      
    NAME                       TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE   SELECTOR
    nginx-ingress-controller   LoadBalancer   10.128.41.200   192.0.2.0      80:30889/TCP,443:32300/TCP   59s   app.kubernetes.io/component=controller,app=nginx-ingress,release=nginx-ingress
        
    
  3. Copy the IP address from the EXTERNAL-IP field. Then navigate to Linode’s DNS Manager and create an A record using this external IP address and a hostname value corresponding to the subdomain you plan to use with your domain.

Now that your NGINX Ingress Controller has been deployed and your domain’s A record has been updated, you are ready to enable HTTPS on your monitoring interfaces.

Install cert-manager

Note

Before performing the commands in this section, ensure that your DNS has had time to propagate across the internet. You can query the status of your DNS by using the following command, replacing example.com with your domain (including a subdomain if you have configured one).

dig +short example.com

If successful, the output should return the IP address of your NodeBalancer.
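
Because propagation can take a while, you may prefer to poll instead of rerunning dig by hand. The following is a minimal sketch of that idea; the function names resolves_to and wait_for_dns, the 10 second interval, and the default of 30 attempts are all assumptions chosen for illustration.

```shell
# Succeed when the expected IP appears as a whole line in the resolver output.
# $1: output of `dig +short`, $2: expected IP address
resolves_to() {
    printf '%s\n' "$1" | grep -Fxq "$2"
}

# Poll DNS until the domain resolves to the expected IP, or give up.
wait_for_dns() {
    domain=$1 expected_ip=$2 attempts=${3:-30}
    while [ "$attempts" -gt 0 ]; do
        if resolves_to "$(dig +short "$domain")" "$expected_ip"; then
            echo "$domain resolves to $expected_ip"
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 10
    done
    echo "Timed out waiting for $domain to resolve to $expected_ip" >&2
    return 1
}

# Example, using the sample NodeBalancer address shown earlier:
#   wait_for_dns example.com 192.0.2.0
# The external IP can also be read directly from the Service:
#   kubectl -n default get svc nginx-ingress-controller \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```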

  1. Install cert-manager’s CRDs.

    kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.2/cert-manager.crds.yaml
    
  2. Add the Helm repository which contains the cert-manager Helm chart.

    helm repo add jetstack https://charts.jetstack.io
    
  3. Update your Helm repositories.

    helm repo update
    
  4. Install the cert-manager Helm chart, creating the cert-manager namespace as part of the install (the --create-namespace flag requires Helm 3.2 or later). These basic configurations should be sufficient for many use cases; additional cert-manager configurable parameters can be found in cert-manager’s official documentation.

    helm install \
    cert-manager jetstack/cert-manager \
    --namespace cert-manager \
    --create-namespace \
    --version v0.15.2
    
  5. Verify that the corresponding cert-manager pods are running and ready.

    kubectl -n cert-manager get pods
    

    You should see output similar to the following:

      
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-749df5b4f8-mc9nj              1/1     Running   0          19s
    cert-manager-cainjector-67b7c65dff-4fkrw   1/1     Running   0          19s
    cert-manager-webhook-7d5d8f856b-4nw9z      1/1     Running   0          19s
        
    

Create a ClusterIssuer Resource

Now that cert-manager is installed and running on your cluster, you will need to create a ClusterIssuer resource which defines which CA can create signed certificates when a certificate request is received. A ClusterIssuer is not a namespaced resource, so it can be used by more than one namespace.

  1. Using the text editor of your choice, create a file named acme-issuer-prod.yaml with the example configurations, replacing the value of email with your own email address for the ACME challenge:

    ~/lke-monitor/acme-issuer-prod.yaml
    
    apiVersion: cert-manager.io/v1alpha2
    kind: ClusterIssuer
    metadata:
      name: letsencrypt-prod
    spec:
      acme:
        email: user@example.com # placeholder; replace with your own email address
        server: https://acme-v02.api.letsencrypt.org/directory
        privateKeySecretRef:
          name: letsencrypt-secret-prod
        solvers:
        - http01:
            ingress:
              class: nginx
        
    • This manifest file creates a ClusterIssuer resource that will register an account on an ACME server. The value of spec.acme.server designates Let’s Encrypt’s production ACME server, which should be trusted by most browsers.

      Note
      Let’s Encrypt provides a staging ACME server that can be used to test issuing trusted certificates, while not worrying about hitting Let’s Encrypt’s production rate limits. The staging URL is https://acme-staging-v02.api.letsencrypt.org/directory.
    • The value of privateKeySecretRef.name provides the name of a secret containing the private key for this user’s ACME server account (this is tied to the email address you provide in the manifest file). The ACME server will use this key to identify you.

    • To ensure that you own the domain for which you will create a certificate, the ACME server will issue a challenge to a client. cert-manager provides two options for solving challenges, HTTP01 and DNS01. In this example, the HTTP01 challenge solver is used, configured as http01 in the solvers array. cert-manager will spin up challenge solver Pods to solve the issued challenges and use Ingress resources to route the challenge to the appropriate Pod.

  2. Create the ClusterIssuer resource:

    kubectl apply -f ~/lke-monitor/acme-issuer-prod.yaml
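
If you want to test against the Let’s Encrypt staging server mentioned in the note above, one option is to derive a staging manifest from the production one. The sketch below assumes the manifest uses the names from this guide (letsencrypt-prod and letsencrypt-secret-prod); the function name to_staging, the -staging names, and the output file name are hypothetical.

```shell
# Rewrite the production ClusterIssuer manifest (read from stdin) so that it
# targets the Let's Encrypt staging server and uses "-staging" resource names.
to_staging() {
    sed -e 's#https://acme-v02\.api\.letsencrypt\.org/directory#https://acme-staging-v02.api.letsencrypt.org/directory#' \
        -e 's#letsencrypt-prod#letsencrypt-staging#' \
        -e 's#letsencrypt-secret-prod#letsencrypt-secret-staging#'
}

# Example (the staging file name is an assumption):
#   to_staging < ~/lke-monitor/acme-issuer-prod.yaml > ~/lke-monitor/acme-issuer-staging.yaml
#   kubectl apply -f ~/lke-monitor/acme-issuer-staging.yaml
```

Certificates issued by the staging server are not trusted by browsers, so any resource that references the issuer would need to point back at letsencrypt-prod once testing is done.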
    

Create a Certificate Resource

After you have a ClusterIssuer resource, you can create a Certificate resource. This will describe your X.509 public key certificate and will be used to automatically generate a CertificateRequest which will be sent to your ClusterIssuer.

  1. Using the text editor of your choice, create a file named certificate-prod.yaml with the example configurations:

    Note
    Replace the value of spec.dnsNames with the domain, including subdomains, that you will use to host your monitoring interfaces.
    ~/lke-monitor/certificate-prod.yaml
    
    apiVersion: cert-manager.io/v1alpha2
    kind: Certificate
    metadata:
      name: prometheus-operator-prod
      namespace: monitoring
    spec:
      secretName: letsencrypt-secret-prod
      duration: 2160h # 90d
      renewBefore: 360h # 15d
      issuerRef:
        name: letsencrypt-prod
        kind: ClusterIssuer
      dnsNames:
      - example.com
        
    Note
    The configurations in this example create a Certificate in the monitoring namespace that is valid for 90 days and renews 15 days before expiry.
  2. Create the Certificate resource:

    kubectl apply -f ~/lke-monitor/certificate-prod.yaml
    
  3. Verify that the Certificate has been successfully issued:

    kubectl -n monitoring get certs
    

    When your certificate is ready, you should see output similar to the following:

      
    NAME                       READY   SECRET                    AGE
    prometheus-operator-prod   True    letsencrypt-secret-prod   33s
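
The duration and renewBefore fields in the Certificate manifest are written in hours, which is why 90 days appears as 2160h and 15 days as 360h. As a small illustration of that arithmetic, here is a sketch with a hypothetical helper named days_to_hours:

```shell
# Convert a number of days to the hour-based duration string used by the
# Certificate fields duration and renewBefore.
days_to_hours() {
    echo "$(( $1 * 24 ))h"
}

days_to_hours 90   # prints 2160h, the value used for duration
days_to_hours 15   # prints 360h, the value used for renewBefore
```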
        
    

Next, you will create the necessary resources for basic authentication of the Prometheus and Alertmanager interfaces.

Configure Basic Auth Credentials

In this section, you will use htpasswd to generate credentials for basic authentication and create a Kubernetes Secret, which will then be applied to your Ingress configuration to secure access to your monitoring interfaces.

  1. Create a basic authentication password file for the user admin:

    htpasswd -c ~/lke-monitor/auth admin
    

    Follow the prompts to create a secure password, then store your password securely for future reference.

  2. Create a Kubernetes Secret for the monitoring namespace using the file you created above:

    kubectl -n monitoring create secret generic basic-auth --from-file=auth="$HOME/lke-monitor/auth"
    
  3. Verify that the basic-auth secret has been created on your LKE cluster:

    kubectl -n monitoring get secret basic-auth
    

    You should see output similar to the following:

      
    NAME         TYPE     DATA   AGE
    basic-auth   Opaque   1      81s
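
The Secret listing only shows that one data entry exists. To double-check which users it contains, you can inspect the htpasswd data, which stores one user:hash pair per line. The following is a sketch; the function name htpasswd_users is hypothetical, and the second example assumes the Secret stores the file under the key auth, which is what the command above produces.

```shell
# Print the usernames stored in an htpasswd-format file (one "user:hash" per line).
htpasswd_users() {
    cut -d: -f1 "$1"
}

# Example against the local file created in step 1:
#   htpasswd_users ~/lke-monitor/auth
#
# Example against the Secret on the cluster (requires kubectl access):
#   kubectl -n monitoring get secret basic-auth -o jsonpath='{.data.auth}' \
#     | base64 --decode | cut -d: -f1
```

For the setup in this guide, both commands should list only admin.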
        
    

All the components needed to enable HTTPS on your monitoring interfaces are now in place. In the next section, you will complete the steps needed to deploy Prometheus Operator.

Deploy or Upgrade Prometheus Operator

In this section, you will create a Helm chart values file and use it to deploy Prometheus Operator to your LKE cluster.

  1. Using the text editor of your choice, create a file named values-https-basic-auth.yaml in the ~/lke-monitor directory and save it with the configurations below. Since the control plane is Linode-managed, as part of this step we will also disable metrics collection for the control plane components:

    Note
    Replace all instances of example.com below with the domain you have configured, including subdomains, for use with this guide.
    Caution
    The below configuration will establish persistent data storage with three separate 10GiB Block Storage Volumes for Prometheus, Alertmanager, and Grafana. Because the Prometheus Operator deploys as StatefulSets, these Volumes and their associated Persistent Volume resources must be deleted manually if you later decide to tear down this Helm release.
    ~/lke-monitor/values-https-basic-auth.yaml
    
    # Helm chart values for Prometheus Operator with HTTPS and basic auth
    prometheus:
      ingress:
        enabled: true
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/rewrite-target: /$2
          cert-manager.io/cluster-issuer: letsencrypt-prod
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: basic-auth
          nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
        hosts:
        - example.com
        paths:
        - /prometheus(/|$)(.*)
        tls:
        - secretName: lke-monitor-tls
          hosts:
          - example.com
      prometheusSpec:
        routePrefix: /
        externalUrl: https://example.com/prometheus
        storageSpec:
          volumeClaimTemplate:
            spec:
              storageClassName: linode-block-storage-retain
              resources:
                requests:
                  storage: 10Gi
    
    alertmanager:
      ingress:
        enabled: true
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/rewrite-target: /$2
          cert-manager.io/cluster-issuer: letsencrypt-prod
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: basic-auth
          nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
        hosts:
        - example.com
        paths:
        - /alertmanager(/|$)(.*)
        tls:
        - secretName: lke-monitor-tls
          hosts:
          - example.com
      alertmanagerSpec:
        routePrefix: /
        externalUrl: https://example.com/alertmanager
        storage:
          volumeClaimTemplate:
            spec:
              storageClassName: linode-block-storage-retain
              resources:
                requests:
                  storage: 10Gi
    
    grafana:
      persistence:
        enabled: true
        storageClassName: linode-block-storage-retain
        size: 10Gi
      ingress:
        enabled: true
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/rewrite-target: /$2
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: basic-auth
          nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
        hosts:
        - example.com
        path: /grafana(/|$)(.*)
        tls:
        - secretName: lke-monitor-tls
          hosts:
          - example.com
      grafana.ini:
        server:
          domain: example.com
          root_url: "%(protocol)s://%(domain)s/grafana/"
          enable_gzip: "true"
    
    # Disable control plane metrics
    kubeEtcd:
      enabled: false
    
    kubeControllerManager:
      enabled: false
    
    kubeScheduler:
      enabled: false
        
  2. Export an environment variable to store your Grafana admin password:

    Note
    Replace prom-operator in the below command with a secure password and save the password for later reference.
    export GRAFANA_ADMINPASSWORD="prom-operator"
    
  3. Using Helm, deploy a Prometheus Operator release labeled lke-monitor in the monitoring namespace on your LKE cluster with the settings established in your values-https-basic-auth.yaml file:

    Note
    If you have already deployed a Prometheus Operator release, you can upgrade it by replacing helm install with helm upgrade in the below command.
    helm install \
    lke-monitor stable/prometheus-operator \
    -f ~/lke-monitor/values-https-basic-auth.yaml \
    --namespace monitoring \
    --set grafana.adminPassword="$GRAFANA_ADMINPASSWORD"
    

    Once completed, you will see output similar to the following:

      
    NAME: lke-monitor
    LAST DEPLOYED: Mon Jul 27 17:03:46 2020
    NAMESPACE: monitoring
    STATUS: deployed
    REVISION: 1
    NOTES:
    The Prometheus Operator has been installed. Check its status by running:
      kubectl --namespace monitoring get pods -l "release=lke-monitor"
    
    Visit https://github.com/coreos/prometheus-operator for instructions on how
    to create & configure Alertmanager and Prometheus instances using the Operator.
    
    

  4. Verify that the Prometheus Operator has been deployed to your LKE cluster and its components are running and ready by checking the pods in the monitoring namespace:

    kubectl -n monitoring get pods
    

    You should see output similar to the following, confirming that you are ready to access your monitoring interfaces using your domain:

      
    NAME                                                     READY   STATUS    RESTARTS   AGE
    alertmanager-lke-monitor-prometheus-ope-alertmanager-0   2/2     Running   0          45s
    lke-monitor-grafana-84cbb54f98-7gqtk                     2/2     Running   0          54s
    lke-monitor-kube-state-metrics-68c56d976f-n587d          1/1     Running   0          54s
    lke-monitor-prometheus-node-exporter-6xt8m               1/1     Running   0          53s
    lke-monitor-prometheus-node-exporter-dkc27               1/1     Running   0          53s
    lke-monitor-prometheus-node-exporter-pkc65               1/1     Running   0          53s
    lke-monitor-prometheus-ope-operator-f87bc9f7c-w56sw      2/2     Running   0          54s
    prometheus-lke-monitor-prometheus-ope-prometheus-0       3/3     Running   1          35s
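
The path-based routing in the values file relies on the pairing of a path such as /prometheus(/|$)(.*) with the rewrite-target annotation /$2: the second capture group, meaning everything after the prefix, becomes the path that the backend service receives. The sketch below imitates that substitution locally with sed so you can see the effect. It is only an illustration of the regular expression, not how the Ingress Controller is implemented, and the function name rewrite_path is hypothetical.

```shell
# Apply the same pattern and replacement as the Ingress path and
# rewrite-target annotation: /<prefix>(/|$)(.*)  ->  /<second capture group>
# $1: path prefix without the leading slash, $2: request path
rewrite_path() {
    printf '%s\n' "$2" | sed -E "s#^/$1(/|\$)(.*)#/\2#"
}

rewrite_path prometheus /prometheus/graph               # prints /graph
rewrite_path prometheus /prometheus                     # prints /
rewrite_path alertmanager /alertmanager/api/v2/status   # prints /api/v2/status
```

This is why routePrefix is set to / for Prometheus and Alertmanager while externalUrl keeps the /prometheus or /alertmanager prefix: each application serves from its root, but generates links that include the public prefix.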
        
    

Access Monitoring Interfaces from your Domain

Your monitoring interfaces are now publicly accessible with HTTPS and basic auth from the domain you have configured for use with this guide at the following paths:

Resource       Domain and path
Prometheus     example.com/prometheus
Alertmanager   example.com/alertmanager
Grafana        example.com/grafana

When accessing an interface for the first time, log in as admin with the password you configured for basic auth credentials.

When accessing the Grafana interface, you will then log in again as admin with the password you exported as $GRAFANA_ADMINPASSWORD on your local environment. The Grafana dashboards are accessible at Dashboards > Manage from the left navigation bar.
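
To confirm from the command line that each interface is reachable through the Ingress, you could request each application's health endpoint. The following is a sketch under several assumptions: /-/healthy (Prometheus and Alertmanager) and /api/health (Grafana) are the applications' own health endpoints, the function names are hypothetical, and BASIC_AUTH_PASSWORD is a hypothetical variable holding the password you set with htpasswd.

```shell
# Print a health-check URL for each monitoring interface under the given domain.
health_urls() {
    domain=$1
    printf 'https://%s/prometheus/-/healthy\n' "$domain"
    printf 'https://%s/alertmanager/-/healthy\n' "$domain"
    printf 'https://%s/grafana/api/health\n' "$domain"
}

# Request each URL with basic auth and print the HTTP status code next to it.
check_interfaces() {
    health_urls "$1" | while read -r url; do
        code=$(curl -s -o /dev/null -w '%{http_code}' -u "admin:$BASIC_AUTH_PASSWORD" "$url")
        printf '%s %s\n' "$code" "$url"
    done
}

# Example: BASIC_AUTH_PASSWORD='your-password' check_interfaces example.com
```

A 401 status generally points at the basic auth credentials, while a certificate error from curl suggests the TLS certificate has not been issued yet. Grafana's response may differ from the others because it also performs its own authentication.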

This guide is published under a CC BY-ND 4.0 license.