Deploying a Postgres database with CSI volumes

Hello, I am trying to create a Postgres database in its own namespace and attach a PersistentVolume to it.
I created my cluster with LKE, so the CSI driver is already installed.
The secret postgres-credentials has also been created.
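For reference, the secret was created along these lines (namespace creation included for completeness; the values here are just placeholders):

kubectl create namespace postgres
kubectl create secret generic postgres-credentials -n postgres \
  --from-literal=POSTGRES_USER=admin \
  --from-literal=POSTGRES_DB=mydb \
  --from-literal=POSTGRES_PASSWORD=changeme
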
This is my YAML file for the database:

# Persistent Volume Claim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: postgres
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: linode-block-storage-retain
---
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: postgres
  name: postgres-deployment
spec:
  selector:
    matchLabels:
      app: postgres-container
  template:
    metadata:
      labels:
        app: postgres-container
    spec:
      containers:
        - name: postgres-container
          image: postgres:9.6.6
          env:
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: POSTGRES_USER

            - name: POSTGRES_DB
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: POSTGRES_DB

            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: POSTGRES_PASSWORD

          ports:
            - containerPort: 5432
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgres-volume-mount
      volumes:
        - name: postgres-volume-mount
          persistentVolumeClaim:
            claimName: postgres-pvc
---
# Service
apiVersion: v1
kind: Service
metadata:
  namespace: postgres
  name: postgres-service
spec:
  selector:
    app: postgres-container
  ports:
    - port: 5432
      protocol: TCP
      targetPort: 5432
  type: NodePort

When I go to my Linode Cloud Manager dashboard I can see that the volume has been created and everything seems fine.
These are some events from the postgres pod that show the error:

MountVolume.MountDevice failed for volume "pvc-aa9e0765c2c74cb7" : rpc error: code = Internal desc = Unable to find device path out of attempted paths: [/dev/disk/by-id/linode-pvcaa9e0765c2c74cb7 /dev/disk/by-id/scsi-0Linode_Volume_pvcaa9e0765c2c74cb7]
Unable to attach or mount volumes: unmounted volumes=[postgres-volume-mount], unattached volumes=[postgres-volume-mount default-token-mgvtv]: timed out waiting for the condition

14 Replies

Hey there -

This is a tough one, but I've come across this situation before, so I want to give you a couple of things to look into.

One thing to check is that your volume is only being mounted by a single container. If multiple containers attempt to mount it, the mount will fail.

I've also seen this happen as a result of a syntax error. My recommendation is to go through your manifest to make sure everything is formatted correctly (no extra spaces, tabs, or anything like that).
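A quick way to catch most of those formatting issues is a client-side dry run (the filename here is just an example):

kubectl apply --dry-run=client -f postgres.yaml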

I hope this helps!

Did you find a resolution to this? I'm having a similar problem at the moment:

AttachVolume.Attach succeeded for volume "pvc-a1b6aa5..."
MountVolume.MountDevice failed for volume "pvc-a1b6aa5..." : rpc error: code = Internal desc = Unable to find device path out of attempted paths: [/dev/disk/by-id/linode-pvca1b6aa53... /dev/disk/by-id/scsi-0Linode_Volume_pvca1b6aa53...]

Did you find a resolution to this? I'm having a similar problem at the moment.

MountVolume.MountDevice failed for volume "pvc-d23fbce33cee4fa7" : rpc error: code = Internal desc = Unable to find device path out of attempted paths: [/dev/disk/by-id/linode-pvcd23fbce33cee4fa7 /dev/disk/by-id/scsi-0Linode_Volume_pvcd23fbce33cee4fa7]

> One thing to check is that your volume is only being mounted by a single container. If multiple containers attempt to mount it, the mount will fail.

RWO volumes can be mounted by multiple pods, as long as they're on the same node, right?

I started seeing this error when migrating applications to a new node pool. In at least several cases, I was able to fix it by manually detaching and then reattaching the volume to the node via the https://cloud.linode.com/volumes UI. (Whether or not it was safe to do, I'm not sure.)

I have the same problem, using the exact same statements in the YAML files. Are there any solutions for this?

This will work, but you might need to wait for the first mount to fail, which can take 10 minutes.

Simply delete the VolumeAttachment object in Kubernetes, or detach the volume from the Linode Cloud Manager UI. Then recreate the pod and be patient for around 10 minutes.
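
If you go the kubectl route, something along these lines should do it (the names below are placeholders for your own volume and pod):

# find the stuck attachment for the affected volume
kubectl get volumeattachments | grep pvc-aa9e0765c2c74cb7

# delete it so the CSI driver can attach the volume cleanly again
kubectl delete volumeattachment <attachment-name>

# recreate the pod; the Deployment will spin up a replacement
kubectl delete pod <postgres-pod-name> -n postgres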

This is obviously not great if you're running a high volume production application, but in that case it's best not to run your database on Kubernetes.

Experiencing the same issue here too, same conditions, but not with postgres.

Same issue here. Re-installing/Upgrading/Redeploying the wordpress app results in the same error:

MountVolume.MountDevice failed for volume "pvc-db413a06bd404b84" : rpc error: code = Internal desc = Unable to find device path out of attempted paths: [/dev/disk/by-id/linode-pvcdb413a06bd404b84 /dev/disk/by-id/scsi-0Linode_Volume_pvcdb413a06bd404b84]

Getting tired of these Volume issues to be honest.

My Postgres app was redeployed and assigned to a new node. The volume got automatically detached and attached to that new node, but the container/pod failed to start with mounting errors. I redeployed the pod back onto its original node. The volume was successfully mounted back to the old node, but the pod/container still won't mount it:

Events:
  Type     Reason       Age                 From     Message
  ----     ------       ----                ----     -------
  Warning  FailedMount  30m (x9 over 50m)   kubelet  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[dshm data kube-api-access-n6t5h]: timed out waiting for the condition
  Warning  FailedMount  16m (x2 over 25m)   kubelet  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data kube-api-access-n6t5h dshm]: timed out waiting for the condition
  Warning  FailedMount  12m (x3 over 39m)   kubelet  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[kube-api-access-n6t5h dshm data]: timed out waiting for the condition
  Warning  FailedMount  92s (x33 over 52m)  kubelet  MountVolume.MountDevice failed for volume "pvc-19d050b1a14040c6" : rpc error: code = Internal desc = Unable to find device path out of attempted paths: [/dev/disk/by-id/linode-pvc19d050b1a14040c6 /dev/disk/by-id/scsi-0Linode_Volume_pvc19d050b1a14040c6]

PVC description

Name:          data-fanzy-postgresql-dev-0
Namespace:     fanzy-dev
StorageClass:  linode-block-storage-retain
Status:        Bound
Volume:        pvc-19d050b1a14040c6
Labels:        app.kubernetes.io/component=primary
               app.kubernetes.io/instance=fanzy-postgresql-dev
               app.kubernetes.io/name=postgresql
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: linodebs.csi.linode.com
               volume.kubernetes.io/storage-provisioner: linodebs.csi.linode.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       fanzy-postgresql-dev-0
Events:        <none>

PV description

Name:              pvc-19d050b1a14040c6
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: linodebs.csi.linode.com
Finalizers:        [kubernetes.io/pv-protection external-attacher/linodebs-csi-linode-com]
StorageClass:      linode-block-storage-retain
Status:            Bound
Claim:             fanzy-dev/data-fanzy-postgresql-dev-0
Reclaim Policy:    Retain
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          10Gi
Node Affinity:     <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            linodebs.csi.linode.com
    FSType:            ext4
    VolumeHandle:      516140-pvc19d050b1a14040c6
    ReadOnly:          false
    VolumeAttributes:  storage.kubernetes.io/csiProvisionerIdentity=1662712251649-8081-linodebs.csi.linode.com
Events:            <none>

Now I've noticed that even newly created PVCs are failing to get attached to new pods/containers, with the same error.

I ran through this example (https://github.com/linode/linode-blockstorage-csi-driver#create-a-kubernetes-secret) and reinstalled the drivers. PVC gets successfully created but fails to mount.

kubectl get pvc/csi-example-pvc pods/csi-example-pod

NAME                                    STATUS   VOLUME                 CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
persistentvolumeclaim/csi-example-pvc   Bound    pvc-c0ea8df9e5684244   10Gi       RWO            linode-block-storage-retain   21m

NAME                  READY   STATUS              RESTARTS   AGE
pod/csi-example-pod   0/1     ContainerCreating   0          21m

Here's the error description from the pod:

Events:
  Type     Reason              Age                From                     Message
  ----     ------              ----               ----                     -------
  Normal   Scheduled           14m                default-scheduler        Successfully assigned default/csi-example-pod to lke71838-112699-635487f6efa8
  Warning  FailedMount         3m38s              kubelet                  Unable to attach or mount volumes: unmounted volumes=[csi-example-volume], unattached volumes=[kube-api-access-zvksd csi-example-volume]: timed out waiting for the condition
  Warning  FailedMount         83s (x5 over 12m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[csi-example-volume], unattached volumes=[csi-example-volume kube-api-access-zvksd]: timed out waiting for the condition
  Warning  FailedAttachVolume  14s (x7 over 12m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-c0ea8df9e5684244" : Attach timeout for volume 802990-pvcc0ea8df9e5684244

I think you are seeing this strange behavior because of the deployment strategy used for a Deployment with a persistent volume and the linode-block-storage-retain StorageClass.

You need to change the Deployment strategy to Recreate. By default, it uses RollingUpdate.

apiVersion: apps/v1
kind: Deployment
...
spec:
  strategy:
    type: Recreate
...

The difference between Recreate and RollingUpdate is that the Recreate strategy terminates the old pod before creating the new one, while RollingUpdate creates the new pod before terminating the old one. If you are not using persistent volumes, then either strategy is fine. But with persistent volumes that are supposed to attach to only one pod, if the old pod is not terminated first, the new one will fail to come up and will remain in the ContainerCreating state waiting for the storage to show up. This behavior can produce all sorts of inconsistent results.
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
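
Applied to the Deployment from the original post, the relevant part would look something like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: postgres
  name: postgres-deployment
spec:
  strategy:
    type: Recreate   # old pod is terminated first, so the volume can detach and re-attach to the new pod
  selector:
    matchLabels:
      app: postgres-container
...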

For statefulsets, when using Rolling Updates with the default Pod Management Policy (OrderedReady), it's possible to get into a broken state that requires manual intervention to repair. Please check the limitations section of k8s docs for more information: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#limitations
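
If you do end up in that state, the usual manual fix is to delete the stuck pod and let the StatefulSet controller recreate it (names below are placeholders):

# the StatefulSet controller recreates the pod with the current spec
kubectl delete pod <stuck-pod-name> -n <namespace>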

Also, if you change something in the Cloud Manager UI that was auto-generated by Kubernetes, you might end up with weird issues. Kubernetes may still be looking for the name/label it assigned when provisioning the resource and won't find it if the label was later changed from the UI. I once updated the label of a volume in the Cloud Manager UI and Kubernetes failed to identify it, since the PV inside Kubernetes was still referring to the old auto-generated label. I had to clean things up to get it fixed.

I got a similar problem when stateful sets were rescheduled on a different node when the cluster size was reduced…

Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               37m                   default-scheduler        Successfully assigned default/mongo-0 to lkeABC-DEF-XYZ
  Warning  FailedAttachVolume      37m                   attachdetach-controller  Multi-Attach error for volume "pvc-XYZ" Volume is already exclusively attached to one node and can't be attached to another
  Normal   SuccessfulAttachVolume  36m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-XYZ"
  Warning  FailedMount             16m (x2 over 21m)     kubelet                  Unable to attach or mount volumes: unmounted volumes=[mongo-data3], unattached volumes=[kube-api-access-5r75k mongo-data3]: timed out waiting for the condition
  Warning  FailedMount             6m16s (x23 over 36m)  kubelet                  MountVolume.MountDevice failed for volume "pvc-XYZ" : rpc error: code = Internal desc = Unable to find device path out of attempted paths: [/dev/disk/by-id/linode-pvcXYZ /dev/disk/by-id/scsi-0Linode_Volume_pvcXYZ]
  Warning  FailedMount             66s (x13 over 35m)    kubelet                  Unable to attach or mount volumes: unmounted volumes=[mongo-data3], unattached volumes=[mongo-data3 kube-api-access-5r75k]: timed out waiting for the condition

Is there a recommended configuration for stateful sets to avoid this multi-attach then repeated FailedMount and pod not starting issue?
