[Solved-Partially] Kubernetes HTTP01 Challenge Separate Namespace

Question

serviceme 5 years, 5 months ago

Hey folks, I am trying to get SSL working on my Kubernetes cluster.

I am following:
https://www.linode.com/docs/kubernetes/how-to-configure-load-balancing-with-tls-encryption-on-a-kubernetes-cluster/#create-a-tls-certificate-using-cert-manager

* The ingress controller is deployed in the default namespace.
* The application is installed in the app01 namespace.
* The Ingress object is deployed to the app01 namespace.
* I confirmed that plain HTTP traffic works when the TLS section and the cert-manager annotations are removed.

Relevant portions of the Ingress:

metadata:
  ...
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    cert-manager.io/acme-challenge-type: http01
spec:
  tls:
  - hosts:
    - redact.com
    - redact.com
    secretName: redact-tls

However, when I run kubectl describe challenges in the app01 namespace, I see that they are all failing HTTP-01 challenge propagation.

I think it may be similar to the issue described here: https://www.digitalocean.com/community/questions/how-do-i-correct-a-connection-timed-out-error-during-http-01-challenge-propagation-with-cert-manager

TLDR:
controller on default namespace
application on app01 namespace
ingress object with TLS info deployed to app01 namespace
http-01 challenge failing

3 Replies

rl0nergan · Answer 1 · Sept. 8, 2020, 11:47 p.m.

Hey @serviceme,

I did some testing of this myself, but I was unable to recreate the connection timeout issues you're seeing. I configured my environment like this:

* My Nginx Ingress Controller was deployed in the default namespace.
* I deployed the Nginx demo application included in the guide you mentioned to the app01 namespace.
* I deployed the Ingress to the app01 namespace.

I don't believe the DigitalOcean post is related, as there's no need to set a hostname on an LKE cluster's NodeBalancer. If possible, it would be helpful to post some of the errors from your cert-manager logs so we can get a better idea of the root cause.

Regards,
Ryan L.
Linode Support Staff

serviceme · Answer 2 · Sept. 9, 2020, 1:14 a.m.

EDIT
Solved, partially (I don't have a clear answer on how to fix the original approach).

The root cause here was that I was using Nginx's own Ingress controller rather than the Kubernetes-maintained Nginx Ingress controller.

The Kubernetes-maintained version creates a default service that routes HTTP traffic appropriately to the challenges that are deployed. Nginx's version does not.
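One way to tell the two controllers apart is by the container image the controller pod runs. The following is a rough sketch, not an authoritative check: the function name `classify_ingress_image` is hypothetical, and the image name patterns are assumptions based on the commonly published images (`nginx/nginx-ingress` for Nginx's own controller, `ingress-nginx/controller` or the older `kubernetes-ingress-controller/nginx-ingress-controller` for the Kubernetes-maintained one). A privately mirrored or renamed image would fall through to "unknown".

```shell
#!/bin/sh
# Guess which Nginx ingress controller an image reference belongs to.
# The patterns below are assumptions about common public image names.
classify_ingress_image() {
  case "$1" in
    *ingress-nginx/controller*|*kubernetes-ingress-controller/nginx-ingress-controller*)
      # Kubernetes-maintained controller (kubernetes/ingress-nginx)
      echo "kubernetes-community" ;;
    nginx/nginx-ingress*|*/nginx/nginx-ingress*)
      # Controller published by Nginx (nginxinc/kubernetes-ingress)
      echo "nginx-inc" ;;
    *)
      echo "unknown" ;;
  esac
}
```

To feed it real data, the images running in the controller's namespace can be listed with `kubectl get pods -n default -o jsonpath='{range .items[*]}{.spec.containers[*].image}{"\n"}{end}'` and each line passed to `classify_ingress_image`.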

If I deploy another cluster at some point, I'll look into this more. My guess is that with Nginx's controller the default backend is essentially HTTPS, and something odd in the interaction with cert-manager causes the TLS handshake to fail on the GET, so the certificates are never generated.

I think something similar to what is done on the Ingress here would be needed to handle the port 80 call: https://medium.com/containerum/how-to-launch-nginx-ingress-and-cert-manager-in-kubernetes-55b182a80c8f

Thanks @rl0nergan for the reply. Hopefully I am not doing anything stupid here (which I wouldn't put past me ;)). Let me know if there are other logs I can provide that may be helpful.

As a note, my domains are managed via Namecheap and not imported into the DNS Manager on Linode (I am assuming this isn't an issue here). My A records point to the NodeBalancer's external IP. Also, I am using the latest cert-manager referenced here https://cert-manager.io/docs/installation/kubernetes/ rather than the 0.15 release referenced in the Linode article.

From my understanding of the error, the challenge is served by the solver pods, and external traffic can't reach those temporary pods (so the self check fails and we never proceed to Let's Encrypt's servers) [https://cert-manager.io/docs/faq/acme/]

kubectl get pods -n app01

NAME                        READY   STATUS    RESTARTS   AGE
app01core                   1/1     Running   0          21h
cm-acme-http-solver-29sqz   1/1     Running   0          18h
cm-acme-http-solver-cgds7   1/1     Running   0          18h
cm-acme-http-solver-lt7q7   1/1     Running   0          18h
cm-acme-http-solver-zqd86   1/1     Running   0          18h

kubectl describe ingress -n app01
I noticed there is no nginx-ingress-default-backend among the services in my default namespace. I installed the Nginx controller via https://docs.nginx.com/nginx-ingress-controller/installation/installation-with-helm/ rather than the Kubernetes-maintained Nginx Ingress (helm command was helm install nginx-ingress stable/nginx-ingress --set controller.publishService.enabled=true). I am exploring this because I think it might be part of the root cause: I saw another website mention handshake issues, for a different problem, that were caused by the default backend coming in over HTTPS or something like that.

...

Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)

...

kubectl describe challenges -n app01

...

  Reason:      Waiting for HTTP-01 challenge propagation: failed to perform self check GET request 'http://redact/.well-known/acme-challenge/redact': Get "https://redact:443/.well-known/acme-challenge/redact": remote error: tls: handshake failure
  State:       pending

...

kubectl logs cert-manager-5bc6c5cb94-22hfb -n cert-manager

E0909 01:12:54.634692       1 sync.go:183] cert-manager/controller/challenges "msg"="propagation check failed" "error"="failed to perform self check GET request 'http://www.redact.com/.well-known/acme-challenge/redact': Get \"https://www.redact.com:443/.well-known/acme-challenge/redact\": remote error: tls: handshake failure" "dnsName"="www.redact.com" "resource_kind"="Challenge" "resource_name"="app01-tls-7l84x-1528030512-2246324353" "resource_namespace"="app01" "resource_version"="v1" "type"="HTTP-01"
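The self check in the log above can be approximated by hand. The following is a minimal sketch, assuming `curl` is available on a machine outside the cluster. The helper names `acme_challenge_url` and `probe_challenge` are hypothetical, and the host and token arguments stand in for the redacted values. It requests the challenge path over plain HTTP without following redirects and prints the status code plus any redirect target, which makes an HTTP-to-HTTPS redirect like the one implied by the error (`http://...` requested, `https://...:443` fetched) visible.

```shell
#!/bin/sh
# Build the HTTP-01 challenge URL for a host and token (plain HTTP, port 80).
acme_challenge_url() {
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Request the challenge URL without following redirects and print
# "<status code> <redirect target, if any>".
probe_challenge() {
  curl --silent --output /dev/null --max-time 10 \
    --write-out '%{http_code} %{redirect_url}\n' \
    "$(acme_challenge_url "$1" "$2")"
}
```

For example, `probe_challenge www.example.com some-token` (both placeholders) prints a 3xx status with an `https://` target if something in front of the solver redirects to HTTPS. That would be consistent with the handshake failure in the log, though it does not by itself prove where the redirect comes from.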

ingress object

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: app01
  namespace: app01
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    cert-manager.io/acme-challenge-type: http01
spec:
  tls:
  - hosts:
    - redact.video
    - www.redact.video
    - redact.com
    - www.redact.com
    secretName: redact-tls
  rules:
  - host: redact.video
    http:
      paths:
      - backend:
          serviceName: app01-core
          servicePort: 8000
  - host: www.redact.video
    http:
      paths:
      - backend:
          serviceName: app01-core
          servicePort: 8000
  - host: redact.com
    http:
      paths:
      - backend:
          serviceName: app01-core
          servicePort: 8000
  - host: www.redact.com
    http:
      paths:
      - backend:
          serviceName: app01-core
          servicePort: 8000

clusterissuer

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: redact@gmail.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-secret-prod
    solvers:
    - http01:
        ingress:
          class: nginx

and the actual pod being deployed to app01 (a containerized flask app with gunicorn)

apiVersion: v1
kind: Pod
metadata:
  name: app01core
  namespace: app01
  labels:
    app: app01core
spec:
  containers:
    - name: main-app-container
      image: redact.azurecr.io/redact/core_app:latest
      imagePullPolicy: IfNotPresent
      env:
        - name: SECRET_KEY
          valueFrom:
            secretKeyRef:
              name: environment
              key: SECRET_KEY
        - name: RECAPTCHA_PUB
          valueFrom:
            secretKeyRef:
              name: environment
              key: RECAPTCHA_PUB
        - name: RECAPTCHA_PRV
          valueFrom:
            secretKeyRef:
              name: environment
              key: RECAPTCHA_PRV
        - name: SENDGRID_KEY
          valueFrom:
            secretKeyRef:
              name: environment
              key: SENDGRID_KEY
        - name: SENDGRID_SENDER
          valueFrom:
            secretKeyRef:
              name: environment
              key: SENDGRID_SENDER
      ports:
        - containerPort: 8000
  imagePullSecrets:
    - name: acr-secret

badrdouah · Answer 3 · Aug. 27, 2022, 12:22 a.m.

Solved by uninstalling the Nginx ingress controller with helm and installing it again in the 'default' namespace.
