Renewing Kubernetes certificates
Overview
Kubernetes uses many different TLS certificates for various levels of internal and external cluster communication, for components such as the kubelet, apiserver and scheduler, to name a few. Usually these certificates are only valid for 12 months.
These certificates are created (and signed by the K8s internal CA) during the initial installation. However, even though some options for automated renewal are available, they are not always utilised, and these certs can go out of date. Renewing certain certificates may also require restarts of K8s components, which may not be fully automated.
If any of these certificates is outdated or expired, parts or all of your cluster will stop functioning correctly. Obviously this scenario should be avoided - especially in production environments.
This blog entry focuses on the manual renewal / re-creation of Kubernetes certificates.
Scope
Below is a list of internal K8s (1.16) files (on each master node) which contain certificates.
/etc/kubernetes/admin.conf
/etc/kubernetes/controller-manager.conf
/etc/kubernetes/scheduler.conf
/etc/kubernetes/pki/apiserver.crt
/etc/kubernetes/pki/apiserver-kubelet-client.crt
/etc/kubernetes/pki/apiserver-etcd-client.crt
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/etcd/healthcheck-client.crt
/etc/kubernetes/pki/etcd/peer.crt
/etc/kubernetes/pki/etcd/server.crt
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-client.crt
/var/lib/kubelet/pki/kubelet.crt
/var/lib/kubelet/pki/kubelet-client-current.pem
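The expiry date of each of these files can be checked directly with openssl - a minimal sketch (iterating over the paths listed above):
for c in /etc/kubernetes/pki/*.crt /etc/kubernetes/pki/etcd/*.crt; do
  echo "$c: $(openssl x509 -noout -enddate -in "$c")"
done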
There are also some certificates on each worker node, mainly used by kubelet.
/etc/kubernetes/kubelet.conf
/etc/kubernetes/pki/ca.crt
/var/lib/kubelet/pki/kubelet.crt
/var/lib/kubelet/pki/kubelet-client-current.pem
Status check
Kubernetes conveniently offers kubeadm command line options to verify certificate expiration. As you can see below, all certificates are still valid for almost a year in this cluster.
$ kubeadm alpha certs check-expiration
CERTIFICATE EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
admin.conf Apr 05, 2021 19:17 UTC 363d no
apiserver Apr 05, 2021 19:17 UTC 363d no
apiserver-etcd-client Apr 05, 2021 19:17 UTC 363d no
apiserver-kubelet-client Apr 05, 2021 19:17 UTC 363d no
controller-manager.conf Apr 05, 2021 19:17 UTC 363d no
etcd-healthcheck-client Apr 05, 2021 19:17 UTC 363d no
etcd-peer Apr 05, 2021 19:17 UTC 363d no
etcd-server Apr 05, 2021 19:17 UTC 363d no
front-proxy-client Apr 05, 2021 22:41 UTC 363d no
scheduler.conf Apr 05, 2021 19:17 UTC 363d no
Alternatively you can use openssl to verify the expiry time when connecting to the apiserver endpoint (this can also be used to verify that the apiserver has been restarted since renewing the certificate):
# echo | openssl s_client -showcerts -connect 127.0.0.1:6443 -servername api 2>/dev/null | openssl x509 -noout -enddate
notAfter=Apr 5 19:17:16 2021 GMT
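Note that the kubeadm check above does not list the kubelet certificates under /var/lib/kubelet/pki - these can be checked directly against the files:
openssl x509 -noout -enddate -in /var/lib/kubelet/pki/kubelet-client-current.pem
openssl x509 -noout -enddate -in /var/lib/kubelet/pki/kubelet.crt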
Manual renewal process
It is best practice to back up the /etc/kubernetes/pki folder on each master before renewing certificates.
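For example, with a simple dated copy (the destination path is just a suggestion):
cp -a /etc/kubernetes/pki /etc/kubernetes/pki.bak.$(date +%F)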
All Kubernetes certificates can be re-created via kubeadm. Some certificates are specific to each master node's name, and some are shared by the same service across the different master servers (if any).
These commands will update / overwrite the corresponding certificate files as described above:
kubeadm alpha certs renew apiserver-kubelet-client
kubeadm alpha certs renew apiserver
kubeadm alpha certs renew front-proxy-client
kubeadm alpha certs renew apiserver-etcd-client
kubeadm alpha certs renew controller-manager.conf
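The certificates embedded in the remaining kubeconfig files can be renewed the same way, and (from kubeadm 1.15 onwards) all kubeadm-managed certificates can be renewed in one go:
kubeadm alpha certs renew admin.conf
kubeadm alpha certs renew scheduler.conf
kubeadm alpha certs renew all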
Renewing a certificate also requires the corresponding Kubernetes containers to be restarted. In most cases just deleting the pod (such as kubectl delete pod -n kube-system kube-scheduler-master1) or restarting kubelet will cause the containers / pods to be restarted and to read the new certificates.
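For example (master1 being the example master node name used throughout this post):
kubectl delete pod -n kube-system kube-scheduler-master1
kubectl delete pod -n kube-system kube-controller-manager-master1
systemctl restart kubelet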
The kube-apiserver process / pod uses many different TLS certificates - so it should ideally be restarted whenever any certificate changes / gets updated.
We have experienced problems with deleting the api-server pods using kubectl delete pod -n kube-system kube-apiserver-master1. The command completes as you would expect, that is, the pod's age resets to 0 seconds and the status temporarily transitions to Pending before returning to Running. However, the docker container does not actually get restarted! This is because kube-apiserver runs as a static pod managed directly by the kubelet - the object visible via kubectl is only a mirror pod, so deleting it does not touch the underlying container.
Non-restarting api-server pods can be identified by running docker ps | grep kube-apiserver on the master. If the docker container uptime has not been reset, then the container can be killed via:
docker rm -f `docker ps | grep k8s_kube-apiserver | cut -d" " -f1`
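The kubelet will then recreate the container from the static pod manifest within a few seconds. The restart can be verified via the container uptime and the openssl check from above:
docker ps | grep k8s_kube-apiserver
echo | openssl s_client -connect 127.0.0.1:6443 2>/dev/null | openssl x509 -noout -enddate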
We have run into this exact problem in the past, when the Kubernetes API server failed to communicate with the metrics-server (kubectl top and the HPA stopped working). The process was just logging Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid as the error message. The root cause was an expired front-proxy-client certificate - it had recently been renewed on disk, but without explicitly restarting the kube-apiserver containers, so they were still using the old certificate.
Other certs
It is also quite common to have TLS certificates stored in Kubernetes secrets or config maps, used by ingress controllers etc. These also need to be renewed in a timely manner.
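A certificate stored in a TLS secret can be checked in a similar fashion (ingress-ns and my-tls-secret being placeholder names):
kubectl get secret -n ingress-ns my-tls-secret -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -enddate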
Conclusion
Kubernetes is quite complex and there are many moving parts which need to be maintained. Even though Reece has operated production Kubernetes clusters for years, we are still constantly learning and experiencing interesting challenges in this ever-changing space.