Linode Block Storage CSI Driver

Intro

Block Storage and Container Orchestration had a big year at Linode.

In February 2018, Linode announced Block Storage Volumes. In the ten months since its introduction, thousands of users have provisioned petabytes of storage through the service. What started in only one region is now available in seven regions, with more on the way. The service provides fast storage access with minimal user configuration and system overhead.

Enter the Linode Block Storage Container Storage Interface (CSI) driver. With the Block Storage service available in nearly all of Linode's regions, it is well suited to back persistent storage claims in container orchestrators like Kubernetes and Mesos. In November 2018, we built on the work of external collaborators (1) to promote the Linode Block Storage CSI driver to Linode's GitHub organization.

Edit March 14, 2019: us-southeast (Atlanta) does not have Block Storage support. The CSI driver will fail in this environment with events that contain the error message returned from the Linode API.

What is a CSI?

The Container Storage Interface specification provides an abstract interface that enables any container orchestrator to make persistent storage claims against a multitude of storage backends, including a user's chosen cloud provider. The CSI specification is being adopted broadly, not just at Linode.

The Linode Block Storage CSI driver adheres to the CSI 1.0.0 specification, released on November 15, 2018. With the introduction of the Linode CSI, Linode users can take advantage of the dynamic storage provisioning offered by container orchestrators like Kubernetes, backed by Linode Block Storage Volumes.

Persistent Storage use in Kubernetes on Linode

When deploying stateful applications in Kubernetes, such as databases (MySQL, PostgreSQL), data stores (Redis, MongoDB), or file stores (NFS), it's desirable to back those services with persistent storage. Pods providing these services can access and share volumes that are externally and dynamically managed by persistent storage interfaces (2).

Prior to the Linode CSI, Linode users deploying Kubernetes clusters traditionally had to make use of the fast local SSD storage through tools like Rook with FlexVolume support. Local storage, even when distributed throughout a cluster, cannot be relied on in every case. Storage that can outlive a pod, node, or cluster is a requirement for many environments. With the addition of Linode Block Storage, a second tier of storage could be added to clusters (3). The CSI driver offered us a clear path forward with persistent storage claims on Linode Block Storage.

Usage

Recommended Installers

To get a Linode-integrated Kubernetes experience, we recommend using the linode-cli or the terraform-linode-k8s module directly.

These installers bundle Linode-aware addons, like the Cloud Controller Manager (CCM) (4), Container Storage Interface (CSI) driver, and External-DNS. Linode's recommended installers provision clusters that make use of Linode's regional private IP network to avoid high transfer usage. The K8s masters and nodes will be created with Kubernetes node names that match the Linode labels and Linux hostnames.
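As a rough sketch, a minimal Terraform configuration consuming the registry module might look like the following (the module source is linode/k8s/linode on the Terraform Registry; the linode_token input and any other variable names are assumptions to verify against the module's README):

# Hedged sketch: write a minimal Terraform configuration that consumes the
# terraform-linode-k8s module, then initialize and apply it.
cat > main.tf <<'EOF'
module "k8s" {
  source       = "linode/k8s/linode"       # Terraform Registry source for terraform-linode-k8s
  linode_token = "YOUR-LINODE-API-TOKEN"   # assumed input variable name; check the module docs
}
EOF

terraform init    # downloads the module and the Linode provider
terraform apply   # provisions the masters and nodes defined by the module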

Installing the CSI

Install the manifest which creates the roles, bindings, sidecars, and the CSI driver necessary to use Linode Block Storage in a Kubernetes cluster.

This step is not necessary if the cluster was created with one of the Linode recommended installers. This will only work with Kubernetes 1.13+.

kubectl apply -f https://raw.githubusercontent.com/linode/linode-blockstorage-csi-driver/master/pkg/linode-bs/deploy/releases/linode-blockstorage-csi-driver-v0.0.3.yaml
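Once applied, a quick sanity check confirms the new StorageClass and the driver components are present (exact pod names and namespaces depend on the release you installed):

$ kubectl get storageclass linode-block-storage
$ kubectl get pods --all-namespaces | grep csi-linode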

Consuming Persistent Volumes

Using the Linode CSI is very simple. The region of a dynamically created volume will automatically match the region of the Kubernetes node. There is currently a minimum size requirement of 10GiB and a maximum volume count per node of eight (which includes the local disk(s), e.g. sda, sdb).

Create a manifest named csi-example.yaml, and add it to the cluster using kubectl apply -f csi-example.yaml.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-example-pvc
spec:
  accessModes:
  - ReadWriteOnce   # Currently, the only supported accessMode
  resources:
    requests:
      storage: 10Gi # Linode Block Storage has a minimum of 10GiB
  storageClassName: linode-block-storage

---

apiVersion: v1
kind: Pod
metadata:
  name: csi-example-pod
spec:
  containers:
    - name: csi-example-container
      image: busybox
      volumeMounts:
      - mountPath: "/data"
        name: csi-example-volume
      command: [ "sleep", "1000000" ]
  volumes:
    - name: csi-example-volume
      persistentVolumeClaim:
        claimName: csi-example-pvc  # This must match the metadata name of the desired PVC

Within a minute, the pod should be running. Interact with the storage through the pod:

$ kubectl exec -it csi-example-pod -- /bin/sh -c "echo persistence > /data/example.txt; ls -l /data"
total 20
-rw-r--r--    1 root     root            12 Dec  5 13:06 example.txt
drwx------    2 root     root         16384 Dec  5 06:03 lost+found

Delete the pod and recreate it from the same manifest; it will come back up with the same volume attached and the data intact, as shown below.
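For example, assuming the manifest above was saved as csi-example.yaml:

$ kubectl delete pod csi-example-pod
pod "csi-example-pod" deleted
$ kubectl apply -f csi-example.yaml
$ kubectl wait --for=condition=Ready pod/csi-example-pod --timeout=120s
$ kubectl exec -it csi-example-pod -- cat /data/example.txt
persistence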

Additional Resources

Additional details are available in the linode-blockstorage-csi-driver project's README.md.

Features and Limitations

Features

  • Creates Linode Block Storage Volumes on demand
  • Single Node Writer / Read Write Once types
  • Volumes include topology.linode.com/region annotations
  • Prefix Volume Labels

Design Choices

  • Defines the Linode CSI as the default StorageClass
  • Default ReclaimPolicy deletes unused volumes (a custom StorageClass can override this; see the example after this list)
  • Waits up to 300s for volume creation
  • Formats volumes as ext4
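Both the default StorageClass and its reclaim behavior can be overridden by defining your own StorageClass. A minimal sketch (the provisioner name below is taken from the driver's deployment manifest; verify it against the release you installed):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linode-block-storage-retain   # illustrative name
provisioner: linodebs.csi.linode.com  # assumed driver name; confirm in your installed manifest
reclaimPolicy: Retain                 # keep the Linode Volume when the PVC is deleted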

Limitations

  • Currently requires a minimum volume size of 10GiB ($1/mo as of Dec 2018)
  • Currently supports Kubernetes 1.13+
  • Currently supports 7 Block Storage device attachments (see below)
  • Linode Label must match the Kubernetes Node name
  • Snapshots are not supported

Summary

The CSI enables us to support persistent storage claims for container orchestrators like Kubernetes, and those claims enable us to support users' stateful applications. We look forward to 2019, when we plan to provide additional support for container orchestrators.

Footnotes

  1. The development team benefited from the work of AppsCode, Ciaran Liedeman, Ricardo Ramirez (Linode Docker Volume driver) and the Kubernetes Storage SIG contributors in producing early proof-of-concept drivers.

  2. Kubernetes volume plugins began life in-tree, with support for only a few specific cloud providers. Within the Kubernetes community, the Storage and Cloud Provider Special Interest Groups (SIGs) have been working to remove volume plugins and other cloud-specific code from the core Kubernetes codebase. This resulted in out-of-tree FlexVolume drivers and, most recently, Container Storage Interface (CSI) drivers.

  3. These could even be added through Rook, although there are complexities to this configuration and provisioning the attached volumes would be a manual effort. To reduce complexity and simplify Kubernetes access to Linode's Block Storage, AppsCode's Pharmer team created a FlexVolume plugin for Linode. There were limitations with FlexVolume, such as root access requirements for installation and provisioning.

  4. The CCM informs the Kubernetes API of Node status and removals based on the Linode API. If a Linode is deleted through the manager, the pods from that node will be rescheduled to other nodes and the node will be removed from the cluster. When a service uses type: LoadBalancer, a Linode NodeBalancer will be automatically provisioned and configured to route traffic from the public IP address of the NodeBalancer to the private IP address of any of the nodes capable of providing the service (all nodes, thanks to kube-proxy).
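     For illustration, a standard Service manifest like the following (the name, selector, and ports are placeholders) is all that is needed for the CCM to provision a NodeBalancer:

     apiVersion: v1
     kind: Service
     metadata:
       name: example-lb          # placeholder name
     spec:
       type: LoadBalancer        # triggers NodeBalancer provisioning via the Linode CCM
       selector:
         app: example            # placeholder selector
       ports:
       - port: 80                # public port on the NodeBalancer
         targetPort: 8080        # container port reached through kube-proxy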

7 Replies

Getting this when trying to install on Kubernetes 1.13:

$ kubectl apply -f https://raw.githubusercontent.com/linode/linode-blockstorage-csi-driver/master/pkg/linode-bs/deploy/releases/linode-blockstorage-csi-driver-v0.0.3.yaml
customresourcedefinition.apiextensions.k8s.io/csinodeinfos.csi.storage.k8s.io created
customresourcedefinition.apiextensions.k8s.io/csidrivers.csi.storage.k8s.io created
serviceaccount/csi-node-sa created
clusterrole.rbac.authorization.k8s.io/driver-registrar-role created
clusterrolebinding.rbac.authorization.k8s.io/driver-registrar-binding created
serviceaccount/csi-controller-sa created
clusterrole.rbac.authorization.k8s.io/external-provisioner-role created
clusterrolebinding.rbac.authorization.k8s.io/csi-controller-provisioner-binding created
clusterrole.rbac.authorization.k8s.io/external-attacher-role created
clusterrolebinding.rbac.authorization.k8s.io/csi-controller-attacher-binding created
clusterrole.rbac.authorization.k8s.io/external-snapshotter-role created
clusterrolebinding.rbac.authorization.k8s.io/csi-controller-snapshotter-binding created
storageclass.storage.k8s.io/linode-block-storage created
statefulset.apps/csi-linode-controller created
daemonset.extensions/csi-linode-node created
error: unable to recognize "https://raw.githubusercontent.com/linode/linode-blockstorage-csi-driver/master/pkg/linode-bs/deploy/releases/linode-blockstorage-csi-driver-v0.0.3.yaml": no matches for kind "CSIDriver" in version "csi.storage.k8s.io/v1alpha1"
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:39:04Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:31:33Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}

Thanks, @recipedude. The piece of the yaml that is not being accepted follows (the comment block hints at the problem):

---
# pkg/linode-bs/deploy/kubernetes/02-csi-driver.yaml
# Requires CSIDriverRegistry feature gate (alpha in 1.12)
# xref: https://raw.githubusercontent.com/kubernetes/csi-api/master/pkg/crd/manifests/csinodeinfo.yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: csidrivers.csi.storage.k8s.io
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  version: v1alpha1
  group: csi.storage.k8s.io
  names:
    kind: CSIDriver
    plural: csidrivers
  scope: Cluster
  validation:
    openAPIV3Schema:
      properties:
        spec:
          description: Specification of the CSI Driver.
          properties:
            attachRequired:
              description: Indicates this CSI volume driver requires an attach operation,
                and that Kubernetes should call attach and wait for any attach operation
                to complete before proceeding to mount.
              type: boolean
            podInfoOnMountVersion:
              description: Indicates this CSI volume driver requires additional pod
                information (like podName, podUID, etc.) during mount operations.
              type: string

---

According to the list of feature gates, CSIDriverRegistry is not enabled by default in K8s 1.13.1.

Try applying the yaml again after ensuring the required feature gates are enabled with --feature-gates=CSIDriverRegistry=true,CSINodeInfo=true added to your Kubelet and API Server. (These feature flags are enabled in the terraform-linode-k8s module for the API Server and the Kubelet.)
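For a kubeadm-based cluster, that might look like the following sketch (field names are per the kubeadm v1beta1 ClusterConfiguration shipped with 1.13; adjust for how your control plane and kubelets are managed):

# API Server, via kubeadm ClusterConfiguration:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  extraArgs:
    feature-gates: "CSIDriverRegistry=true,CSINodeInfo=true"

# Kubelet, e.g. via its systemd drop-in:
# KUBELET_EXTRA_ARGS=--feature-gates=CSIDriverRegistry=true,CSINodeInfo=true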

I'll have to update the listed requirements and the versions they are introduced at.

This seemed like a fabulous feature and was working great during initial testing. However, we just ran into the volume attach limit quite quickly when scaling up our test environment. With effectively only 6 PVs per node (8 volume limit minus local SSDs) this seems to be of very limited use in real production situations.

Are there plans to relax this 8 volume constraint?

@inviscid Plans to increase this limit have already been enacted, but the CSI driver hasn't caught up, yet.

Volumes previously had to be bound to one of the Linode Boot Config sda-sdh device slots.

The API has already removed that limitation by permitting a Volume to be attached without a ConfigID reference (persist_across_boots).

https://developers.linode.com/api/docs/v4#operation/attachVolume

These volumes will not be reattached by Linode on reboot, which is not a problem in Kubernetes since the CSI will either reattach the volume on reboot or the volume (and related pods) will have already been rescheduled to another node.
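A rough sketch of that call (the token and IDs are placeholders; see the API documentation linked above for the full request schema):

curl -X POST https://api.linode.com/v4/volumes/12345/attach \
  -H "Authorization: Bearer $LINODE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"linode_id": 1234567, "persist_across_boots": false}'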


In LinodeGo, persist_across_boots should be added:

https://github.com/linode/linodego/pull/81

The offending lines in the CSI driver are:

https://github.com/linode/linode-blockstorage-csi-driver/blob/58fef3fac1540c26d3584ef5ef4072dbdadd6c62/pkg/linode-bs/nodeserver.go#L37

Since persist_across_boots defaults to true, for backward compatibility this LinodeGo field will need to be a pointer to a boolean, and it will need to be explicitly set to false here:

https://github.com/linode/linode-blockstorage-csi-driver/blob/58fef3fac1540c26d3584ef5ef4072dbdadd6c62/pkg/linode-bs/controllerserver.go#L227-L234

Excellent! Thanks for the quick response and fix.

We will make use of it as soon as it is available.

We were looking at the documentation and noticed a constraint that we would like to understand a little better.

https://developers.linode.com/api/docs/v4#operation/attachVolume

In there, it states the limit of block device volumes that can be attached to a node is equal to the RAM in GB. Can you help us understand why this constraint is in place?

We certainly understand there needs to be some reasonableness check so someone doesn't attempt attaching 200 volumes to a nanode but some use cases we have require user specific volumes (JupyterHub) which can add up quickly.

Thanks…

@inviscid Your reasoning is accurate. There is an appreciable amount of RAM and CPU overhead involved with attaching and sustaining block storage connections to virtual machines. GCP has similar restrictions in place.

The #linode channel on the Kubernetes Slack is a great place to give this feedback to the developers (and get their responses).
