Kubernetes dynamic volume provisioning using Ceph as storage backend

Introduction

As you might already know, in Kubernetes we can use Persistent Volumes (PV) as the storage resource for Pods. A PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using StorageClasses.

By using a StorageClass we can provision volumes dynamically. There are several supported storage back-ends, such as AzureDisk, AWSElasticBlockStore, GCEPersistentDisk, Ceph, NFS, etc. In this blog post, I am going to show the steps to use Ceph as the storage back-end for a Kubernetes cluster using dynamic volume provisioning.

Prerequisites

First of all, you need a working Ceph cluster. If you are looking for a tutorial to set up a Ceph cluster, take a look at my previous blog post Deploy Ceph storage cluster on Ubuntu server.

And of course, you will need a Kubernetes cluster as well. It can be a managed one from cloud providers like AWS, Azure or GCP. It could also be a self-managed Kubernetes cluster set up with kubeadm.

Before we begin, let’s check the status of our clusters first.

$ ceph status
  cluster:
    id:     be6300b0-eb01-4619-8684-40b6d485f94f
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 2d)
    mgr: ceph1(active, since 2d)
    mds: cephfs:1 {0=ceph1=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 2d), 3 in (since 2d)

  data:
    pools:   3 pools, 24 pgs
    objects: 65 objects, 140 MiB
    usage:   3.4 GiB used, 87 GiB / 90 GiB avail
    pgs:     24 active+clean
$ kubectl get nodes
NAME        STATUS   ROLES    AGE     VERSION
k8smaster   Ready    master   4d12h   v1.17.3
k8sworker   Ready    <none>   4d12h   v1.17.3

$ kubectl get componentstatus
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}

Configuration

Ceph pool and access privileges

We need a dedicated Ceph pool that will be used for Kubernetes volume creation. The following command creates a pool named kube with 8 placement groups.

$ ceph osd pool create kube 8
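On Ceph Luminous or newer, it is also a good idea to tag the new pool for RBD use and confirm it was created. The two commands below are standard Ceph commands; adjust the pool name if you chose a different one.

$ ceph osd pool application enable kube rbd
$ ceph osd pool ls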

If you enabled authentication in your Ceph cluster config, you also have to create a user that the Kubernetes nodes will use to access the pool. The following command creates a client.kube user with the necessary privileges and writes its keyring to ceph.client.kube.keyring.

$ ceph auth get-or-create client.kube mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kube' -o ceph.client.kube.keyring

Now get the client.kube user key

$ ceph auth get client.kube
exported keyring for client.kube
[client.kube]
        key = AQC8cGBel0KsDxAAi4feaaavfXybD3EaB1GHjQ==
        caps mon = "allow r"
        caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=kube"
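If you only need the key string itself, for instance to base64-encode it later for the Kubernetes Secret, ceph auth get-key prints just the key (the same works for client.admin):

$ ceph auth get-key client.kube
AQC8cGBel0KsDxAAi4feaaavfXybD3EaB1GHjQ==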

We also need the admin key, which will be used by the Kubernetes Ceph provisioner. Using the admin key, the provisioner will be able to create volumes inside the pool. I assume that the admin user is client.admin.

$ ceph auth get client.admin
exported keyring for client.admin
[client.admin]
        key = AQCmZ2BetvWXDxAA7Cm23C60qGws6FaxV6yQ4g==
        caps mds = "allow *"
        caps mgr = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"

Kubernetes Secret config

Following are the YAML files for the Secret resources that will be used for Ceph authentication from the Kubernetes cluster. The key value is the base64-encoded key from the previous step.
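For example, the client.kube key from the previous step can be encoded like this (the -n flag prevents a trailing newline from being included in the encoded value):

$ echo -n 'AQC8cGBel0KsDxAAi4feaaavfXybD3EaB1GHjQ==' | base64
QVFDOGNHQmVsMEtzRHhBQWk0ZmVhYWF2Zlh5YkQzRWFCMUdIalE9PQ==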

ceph-user-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ceph-user-secret
  namespace: kube-system
data:
  key: QVFDOGNHQmVsMEtzRHhBQWk0ZmVhYWF2Zlh5YkQzRWFCMUdIalE9PQ==
type: kubernetes.io/rbd
ceph-admin-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ceph-admin-secret
  namespace: kube-system
data:
  key: QVFDbVoyQmV0dldYRHhBQTdDbTIzQzYwcUd3czZGYXhWNnlRNGc9PQ==
type: kubernetes.io/rbd

Create the Secret resources using the kubectl command:

$ kubectl create -f ceph-user-secret.yaml
$ kubectl create -f ceph-admin-secret.yaml
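You can verify that both Secrets exist in the kube-system namespace:

$ kubectl get secret ceph-user-secret ceph-admin-secret -n kube-system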

Kubernetes StorageClass config

Following is the YAML file for the StorageClass resource that the Kubernetes Ceph provisioner will use to access the Ceph cluster.

ceph-storageclass.yaml
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: ceph
  annotations:
provisioner: kubernetes.io/rbd
parameters:
  monitors: 172.17.30.11:6789,172.17.30.12:6789,172.17.30.13:6789
  adminId: admin
  adminSecretName: ceph-admin-secret
  adminSecretNamespace: kube-system
  pool: kube
  userId: kube
  userSecretName: ceph-user-secret
  userSecretNamespace: kube-system

Where:

  • monitors: the list of your Ceph mon nodes. It can be a single node or multiple nodes separated by commas
  • pool: the Ceph pool name created in the previous step
  • adminSecretName / userSecretName: the Secret resources created in the previous step; adminSecretNamespace and userSecretNamespace must match the namespace those Secrets were created in (kube-system here)

Create the StorageClass resource using the kubectl command:

$ kubectl create -f ceph-storageclass.yaml

Verify the recently added StorageClass

$ kubectl get storageclass
NAME   PROVISIONER         RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ceph   kubernetes.io/rbd   Delete          Immediate           false                  1m

Kubernetes worker node authentication

In order to access the Ceph cluster, each Kubernetes worker node must hold the ceph.client.kube.keyring file that was generated in the previous step. Make sure you copy it to the /etc/ceph directory on each node, for example as shown below. This file will be read by the kubelet process whenever it has a running Pod that needs a Persistent Volume mapped to the Ceph StorageClass.
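For example, from the node where the keyring was generated (the hostname k8sworker is just the worker node from this demo setup; repeat for every worker):

$ scp ceph.client.kube.keyring k8sworker:/tmp/
$ ssh k8sworker 'sudo mv /tmp/ceph.client.kube.keyring /etc/ceph/'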

These Kubernetes worker nodes also need Ceph's common packages to interact with the storage. On all your worker nodes, execute the following command:

$ sudo apt install ceph-common
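The command above assumes Debian/Ubuntu workers. On RPM-based distributions the package has the same name and can be installed with yum or dnf:

$ sudo yum install ceph-common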

PVC creation

Now it is time to test the creation of a Persistent Volume Claim (PVC) using the Ceph StorageClass. Following is the YAML definition for the PVC resource.

ceph-test-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-test-pvc
  annotations:
    volume.beta.kubernetes.io/storage-class: ceph
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
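Note that the volume.beta.kubernetes.io/storage-class annotation is the legacy way of selecting a StorageClass. On recent Kubernetes versions you can set spec.storageClassName instead; an equivalent definition would look like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-test-pvc
spec:
  storageClassName: ceph
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi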

Create the PVC resource using the kubectl command:

$ kubectl create -f ceph-test-pvc.yaml

Verify the recently added PVC and PV

$ kubectl get pvc
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-test-pvc   Bound    pvc-80bc7b2d-f19a-4c73-97ae-62fbdff949ee   1Gi        RWO            ceph           2m

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                   STORAGECLASS   REASON   AGE
pvc-80bc7b2d-f19a-4c73-97ae-62fbdff949ee   1Gi        RWO            Delete           Bound    default/ceph-test-pvc   ceph                    2m

The Persistent Volume is ready to use; we can now attach it to any Pod as usual.
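For example, a minimal Pod that mounts the claim could look like the following (the Pod name, image and mount path are arbitrary):

apiVersion: v1
kind: Pod
metadata:
  name: ceph-test-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: ceph-test-pvc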

Known issues

The above steps work fine for a managed Kubernetes cluster. However, if you are using a self-hosted cluster, set up with kubeadm for example, you might face a problem with a missing rbd client binary in the Controller Manager image:

Error: "failed to create rbd image: executable file not found in $PATH, command output:

To fix it, there is a workaround mentioned on Kubernetes' GitHub. We just need to edit the /etc/kubernetes/manifests/kube-controller-manager.yaml file to change the image

from

image: k8s.gcr.io/kube-controller-manager:v1.17.3

to

image: gcr.io/google_containers/hyperkube:v1.17.3

After editing, the kube-controller-manager Pod gets recreated automatically using the new image.
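To confirm the new image is in use, you can inspect the static Pod (its name includes the master node's hostname, k8smaster in this example):

$ kubectl -n kube-system get pod kube-controller-manager-k8smaster -o jsonpath='{.spec.containers[0].image}'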
