Storage is the cornerstone of any application; I can think of only a few applications that don’t use some kind of storage backend. That need is what led me to OpenEBS for my Kubernetes workloads. To quote the OpenEBS website: "OpenEBS helps Developers and Platform SREs easily deploy Kubernetes Stateful Workloads that require fast and highly reliable container attached storage."
This is great, but there were a few hangups along the way. This writeup covers how I configured OpenEBS for my Kubernetes cluster.
Understanding Kubernetes PersistentVolumes, PersistentVolumeClaims, and StorageClasses
Straight from the Kubernetes Docs we have:
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see AccessModes).
In other words, a PersistentVolume represents the actual storage (in my case, a directory on a node’s disk), and a PersistentVolumeClaim is how a pod requests and uses that PersistentVolume.
Essentially it looks like this:
Pod -> PVC -> PV -> Host machine
Where does OpenEBS come in? OpenEBS provides the StorageClass that dynamically provisions the PersistentVolume. Note that my configuration is about as close to the out-of-the-box setup as it gets: I’m using the Local PV Hostpath configuration.
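For reference, here is a sketch of what the default openebs-hostpath StorageClass looks like, written from my recollection of the OpenEBS Local PV Hostpath documentation. Treat the BasePath value and the annotations as assumptions; the copy inside your own openebs-operator.yaml (or the output of kubectl get sc openebs-hostpath -o yaml) is the authoritative version.

```yaml
# Sketch of the default Local PV Hostpath StorageClass (values are assumptions,
# check them against the openebs-operator.yaml you actually downloaded).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-hostpath
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: "hostpath"
      - name: BasePath
        value: "/var/openebs/local"   # directory on the node where volumes are created
provisioner: openebs.io/local
# The PV is only created once a pod that uses the claim has been scheduled,
# so the volume ends up on the node where that pod lands.
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```

The WaitForFirstConsumer binding mode is the part worth noticing: a PVC against this class stays Pending until a pod consumes it, which is expected and not an error.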
Installing OpenEBS
There are multiple ways to install OpenEBS, but I prefer having the YAML file available to modify.
wget https://openebs.github.io/charts/openebs-operator.yaml
kubectl apply -f openebs-operator.yaml
After downloading the OpenEBS operator file, I updated it so that the OpenEBS components run on only two of the nodes in the cluster rather than on every node. This meant adding a nodeSelector to the Deployments and to the DaemonSets; for now, most of the resources in the file have one. (Edits to the file only take effect once it is applied, or re-applied, with kubectl apply.) I also had to label my two storage nodes. The selector and the label command look like the following:
# In the openebs-operator.yaml file
# Added in some node selectors to only use the nodes with the label "storage-node"
nodeSelector:
  "openebs.io/nodegroup": "storage-node"
# This is how the nodes are tagged from the command line
kubectl label node <node-name> "openebs.io/nodegroup"="storage-node"
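To make the placement concrete: in a Deployment or DaemonSet, the nodeSelector belongs in the pod template, under spec.template.spec. The fragment below is a trimmed sketch, and the resource name is hypothetical rather than one taken from openebs-operator.yaml; only the position of the nodeSelector block matters.

```yaml
# Trimmed sketch: where the nodeSelector goes inside a Deployment
# (the same path applies to a DaemonSet).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-openebs-component   # hypothetical name, for illustration only
  namespace: openebs
spec:
  # selector, replicas, and other existing fields stay unchanged
  template:
    spec:
      nodeSelector:
        "openebs.io/nodegroup": "storage-node"
      # existing containers, volumes, and serviceAccountName stay unchanged
```

A nodeSelector placed at the top-level spec of a Deployment (rather than in the pod template) is not a valid field there, so the indentation level is worth double-checking after editing.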
Using the Storage
Using the storage is pretty simple: it takes a PVC and a pod that is scheduled onto the correct node. I did run into one gotcha with pod deployments, though. At one point a pod tried to use the local PersistentVolume from the wrong node. A Local PV Hostpath volume lives on a single node’s disk, so the pod has to be scheduled onto the node that holds it.
The example PVC configuration looks like the following (note the openebs-hostpath storageClassName):
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-hostpath-pvc
spec:
  storageClassName: openebs-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5G
The example Pod deployment looks like the following (note the nodeSelector and the persistentVolumeClaim configurations):
apiVersion: v1
kind: Pod
metadata:
  name: hello-local-hostpath-pod
spec:
  nodeSelector:
    "openebs.io/nodegroup": "storage-node"
  volumes:
    - name: local-storage
      persistentVolumeClaim:
        claimName: local-hostpath-pvc
  containers:
    - name: hello-container
      image: busybox
      command:
        - sh
        - -c
        - 'while true; do echo "`date` [`hostname`] Hello from OpenEBS Local PV." >> /mnt/test/yellow.txt; sleep $(($RANDOM % 5 + 300)); done'
      volumeMounts:
        - mountPath: /mnt/test
          name: local-storage
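Once the pod is scheduled, the provisioner creates a PersistentVolume that is pinned to the node the pod landed on. The following is an illustrative sketch of what such a PV might look like, not output captured from my cluster: the pvc-<uid> name, the path, and <node-name> are placeholders, and the exact fields may differ between OpenEBS versions. Running kubectl get pv -o yaml shows the real object.

```yaml
# Illustrative sketch of a dynamically provisioned Local PV Hostpath volume.
# pvc-<uid> and <node-name> are placeholders, not real values.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-<uid>
spec:
  capacity:
    storage: 5G                      # matches the request in local-hostpath-pvc
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: openebs-hostpath
  local:
    path: /var/openebs/local/pvc-<uid>   # assumes the default BasePath
  # The node affinity ties the volume to one node. A pod using this claim
  # can only run where this rule matches.
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <node-name>
```

The nodeAffinity block is what ties the data to one machine. Keeping the pod’s nodeSelector aligned with the labeled storage nodes helps the pod and its volume end up on the same node.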
With this in place, I can provision databases and other stateful services that use storage on the proper nodes. The next step is provisioning a database using this storage configuration.
Replication between nodes would also be nice to have. OpenEBS offers replication as an option, which I plan to look into. (Because the storage nodes are currently VMs that are being backed up, replication isn’t a huge concern right now.)