Kubernetes - Utilising tmpfs volumes
There are multiple choices for volume types in Kubernetes land, from persistent storage volumes that can be backed by a cloud service like EFS, to non-persistent storage types like ephemeral volumes that disappear with your pod or the worker node backing it.
"tmpfs is a file system which keeps all files in virtual memory" (kernel.org)
If you're after fast volatile storage, then tmpfs might be the way to go, but some caveats should be considered before utilising it for your projects. The most obvious one is that tmpfs won't persist across restarts or crashes, which is slightly different from ephemeral volumes in Kubernetes, which persist until the end of the pod's lifecycle.
A typical use case for such fast volatile volumes, and the one we at Reecetech investigated, is backing a database's temp space, which is often used for temporary tables created during complex queries. While we didn't necessarily get the results we were after from that investigation, we did learn a bit about how tmpfs volumes are created and function within Kubernetes pods.
Preparing tmpfs in Kubernetes
A tmpfs volume can be explicitly created by using the emptyDir volume type and setting the medium field to Memory. This field defines what type of storage on the scheduled node will back the emptyDir volume.
volumes:
  - name: tmpfs-volume
    emptyDir:
      medium: Memory # Memory type storage
One of the less obvious caveats of a tmpfs volume in Kubernetes is the relationship between the size of this volume and the memory available on the node that the pod is scheduled on.
Firstly, the maximum storage space that can be allocated to this tmpfs volume is determined by the memory limits set for the container in the pod specification. This limit is crucial to remember, as it defines the upper boundary for the tmpfs volume size and isn't spelled out as clearly as it is for other volume types. Exceeding this limit can lead to Out of Memory (OOM) issues, restarting your pod and clearing the tmpfs volume's contents.
Secondly, it's important to note that the memory used for this tmpfs volume is shared with the container's OS and not specifically allocated or restricted to it. If this volume becomes too large, it will starve other processes running in the container, including the OS itself, leading to OOM issues. This shared usage means that the allocation of memory to the tmpfs volume has a direct impact on the container's overall performance and stability, requiring you to find the right balance between the application and the allocated volume.
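A quick way to see how much of a container's memory usage is coming from tmpfs-backed files rather than process memory (once you have a pod to exec into, as we set up below) is to look at the cgroup's memory statistics. On cgroup v2 nodes, tmpfs pages are generally accounted under the shmem counter:

/ # grep -E '^(anon|file|shmem) ' /sys/fs/cgroup/memory.stat

The anon value roughly reflects process memory, while shmem reflects data sitting in tmpfs and similar shared-memory files.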
Within the pod specification where we define the volume, there is another key we can set on our tmpfs volume called sizeLimit. As the name suggests, we can set a limit on how large this tmpfs volume can get, essentially preventing the volume from consuming more memory than we expect.
volumes:
  - name: tmpfs-volume
    emptyDir:
      medium: Memory # Memory type storage
      sizeLimit: 200Mi
Note, however, that the memory backing the volume is still shared with the OS; from the operating system's perspective there is no separation implied by sizeLimit beyond capping the volume's size. This means that files created in this tmpfs space still take memory from the OS and can still starve other processes of resources. Processes running in the container can still utilise all memory up to the container memory limits defined in the Kubernetes YAML, including the space reserved for the tmpfs volume.
Now let's take a look at how to get a deployment with a container that contains tmpfs storage running in a Kubernetes environment to play around with.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-super-cool-app
  labels:
    app: 'my-super-cool-app'
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-super-cool-app
  template:
    metadata:
      labels:
        app: 'my-super-cool-app'
    spec:
      containers:
        - name: my-super-cool-app-container
          command:
            - "sh"
            - "-c"
            - "while true; do sleep 6000; done"
          image: busybox
          volumeMounts:
            - mountPath: /tmpfs-storage
              name: tmpfs-storage # Referencing the tmpfs volume
          resources:
            limits:
              cpu: 1
              memory: 400Mi # Defining a maximum
            requests:
              cpu: 0.1
              memory: 400Mi
      volumes:
        - emptyDir:
            medium: Memory # tmpfs storage
            sizeLimit: 200Mi # Restricting storage size
          name: tmpfs-storage
We've now got a deployment YAML set up so that a busybox pod will spin up with a 400Mi memory limit and a tmpfs volume attached at the directory /tmpfs-storage.
Using a local Kubernetes cluster (like Docker Desktop's built-in Kubernetes node), let's deploy it!
❯ kubectl apply -f tmpfs-deployment.yaml
deployment.apps/my-super-cool-app created
❯ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-super-cool-app-6bd6bd7bfb-g87k9 1/1 Running 0 41s
So now that we're up and running, we can exec into the container and see what the memory limit and current usage look like.
❯ kubectl exec -it --tty my-super-cool-app-6bd6bd7bfb-g87k9 -- sh
/ # cd tmpfs-storage/
/tmpfs-storage # cat /sys/fs/cgroup/memory.current
1097728
/tmpfs-storage # cat /sys/fs/cgroup/memory.max
419430400
Depending on which cgroup version your node is running (rather than the container runtime itself), the filenames might be different. Under cgroup v1, for example, the current and maximum memory usage files live at /sys/fs/cgroup/memory/memory.usage_in_bytes and /sys/fs/cgroup/memory/memory.limit_in_bytes respectively.
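If you're not sure which cgroup version a node is using, one quick way to check (assuming your image's stat supports the -f and -c flags, as busybox's does) is to look at the filesystem type mounted at /sys/fs/cgroup:

/ # stat -fc %T /sys/fs/cgroup/
cgroup2fs

cgroup2fs indicates cgroup v2, while tmpfs here would indicate cgroup v1.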
Looking at the output of the memory.current and memory.max files above, it lines up with what was defined in the Kubernetes deployment YAML: roughly 400MB of maximum memory for the container and barely any memory in use at the moment.
tmpfs in Action
Let's simulate a process using some memory with the following head command and quickly check the contents of the current memory usage file.
/tmpfs-storage # head -c 250m /dev/zero | tail &
/tmpfs-storage # ps -ef
PID USER TIME COMMAND
1 root 0:00 sh -c while true; do sleep 6000; done
13 root 0:00 sh
41 root 0:02 head -c 250m /dev/zero
42 root 0:00 tail
43 root 0:00 ps -ef
/tmpfs-storage # cat /sys/fs/cgroup/memory.current
233738240
/tmpfs-storage # cat /sys/fs/cgroup/memory.current
248315904
Now we have a quick way to generate a process with specific memory usage. How about a quick way to allocate space and mock some files being created in our new tmpfs space?
/tmpfs-storage # cat /sys/fs/cgroup/memory.current
1052672
/tmpfs-storage # fallocate -l 100m /tmpfs-storage/example
/tmpfs-storage # ls -al
total 102404
drwxrwxrwt 2 root root 60 Dec 27 23:33 .
drwxr-xr-x 1 root root 4096 Dec 27 04:55 ..
-rw-r--r-- 1 root root 104857600 Dec 27 23:33 example
/tmpfs-storage # cat /sys/fs/cgroup/memory.current
106004480
We've allocated some space on our tmpfs file system to a file called example, and when we check the current memory usage within the container, we can see it has jumped from our usual low idle of a few megabytes to over 100MB, matching the usage of our tmpfs space.
We can see that by using the df command as well:
/tmpfs-storage # df /tmpfs-storage/
Filesystem 1K-blocks Used Available Use% Mounted on
tmpfs 204800 102400 102400 50% /tmpfs-storage
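As another sanity check, we can look at the mount options for the volume. On clusters where the kubelet sizes memory-backed volumes from the sizeLimit (which appears to be the case here, given the df output), the tmpfs mount should carry a size option of roughly 204800k, matching the figures above:

/tmpfs-storage # mount | grep /tmpfs-storage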
Since we have set the limit of our tmpfs space to 200MB, when we try to allocate another file that would bring us over that limit, we should be blocked from doing so, right?
/tmpfs-storage # cat /sys/fs/cgroup/memory.current
106004480
/tmpfs-storage # fallocate -l 150m /tmpfs-storage/example2
fallocate: fallocate '/tmpfs-storage/example2': No space left on device
As expected from the sizeLimit we defined in the Kubernetes YAML, we get an error preventing us from allocating more space than we defined. Seems pretty straightforward: we can see the relationship between tmpfs, the limit we set on the volume with sizeLimit, and the limits we set for the entire container.
Now let's flip the script and suppose we have a memory-intensive container with a tmpfs volume, and we try allocating space on the volume while memory is already low.
/tmpfs-storage # cat /sys/fs/cgroup/memory.current
1060864
/tmpfs-storage # head -c 300m /dev/zero | tail &
/tmpfs-storage # fallocate -l 200m /tmpfs-storage/example
/tmpfs-storage # cat /sys/fs/cgroup/memory.current
377839616
[1]+ Killed head -c 300m /dev/zero | tail
The memory usage climbs after running our test process; eventually the container runs out of memory and the process is killed, as the data in the tmpfs volume takes priority over process memory.
You can imagine that if this was an integral process of a pod, it could fail its liveness check in Kubernetes and trigger a restart of the entire container. Managing this requires some effort to balance the memory requirements of your processes, the OS and the tmpfs space within your container, such that they don't conflict and cause the pod to be killed due to out-of-memory issues.
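One rough way to budget for this (a sketch only; the numbers below are illustrative rather than a recommendation) is to treat the container's memory limit as the sum of your application's expected working set, the tmpfs sizeLimit and some headroom for the OS and spikes:

# Illustrative budget only:
#   400Mi limit = ~100Mi app working set + 200Mi tmpfs sizeLimit + ~100Mi headroom
resources:
  limits:
    memory: 400Mi

Whatever split you choose, remember that files sitting in the tmpfs volume are charged against the same limit as your processes, so the application only ever has the remainder to work with.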
What did we learn?
Setting up a tmpfs volume is fairly straightforward, but there are some caveats to consider when using it in a Kubernetes environment, particularly with production workloads. If there are any takeaways from this, it's these few points:
- The size of the tmpfs mounted volume is limited by the container limits defined in the Kubernetes YAML.
- We can further limit the size of the tmpfs volume within the container limits by using the sizeLimit key in the Kubernetes YAML, but ultimately this memory still comes out of the shared pool for the container.
- Files created/modified in tmpfs are prioritised over running processes, so be careful as your running processes will be killed to make room for files in tmpfs.