Dynamic Admission Control in Kubernetes

Preface

Introduction

At reecetech, personnel regularly work with build and deployment pipelines, which also entails configuring Helm Charts. Since anybody can make a mistake, it makes sense to build safeguards into these delivery pipelines.

Everyone has a different scope and agenda, so the system as a whole can be overlooked, leading to misconfiguration. For example, a Software Engineer may request more Memory and/or CPU to boost their application’s performance, but this resource greediness could starve neighbouring applications.

In response to circumstances like these, our internal release tooling was extended to block anything deemed invalid according to a range of bespoke policies.

In an effort to be more Kubernetes-centric, but wary of the side-effects of shifting further right, we decided to demo a use case for Admission Control in Kubernetes in the form of a Validating Webhook.

What is an Admission Controller in Kubernetes?

An Admission Controller in Kubernetes can be thought of as a gatekeeper: a server that intercepts API requests and may change the request object (Admission Mutation) or accept/deny the request (Admission Validation).

Validating Webhook Core Components

Implementing a Validating Webhook in Kubernetes involves four key resource kinds:

  1. Pod to run the Admission Control Server.
  2. Service to expose the pod within the cluster.
  3. ValidatingWebhookConfiguration to capture object admissions.
  4. Secret to store a certificate and key to be used for Transport Layer Security.

Since webhooks must be served via HTTPS, the server requires certificates to function with Transport Layer Security, making the TLS secret a direct dependency.
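As a sketch, the TLS secret backing the webhook server could take this shape (the names and namespace are illustrative, and the data values are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: admission-tls
  namespace: hypothetical
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>
  tls.key: <base64-encoded private key>
```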

Implementation

I personally consider applying admission control to be a three-stage process, chronologically:

  1. Admission Control Server Development
  2. Helm Chart Curation
  3. Certificate Management

This keeps each stage testable as it is gradually enacted, and helps isolate errors if they arise.

So, let’s delve deeper into these stages and break down the implementation of the Validating Webhook.

Admission Control Server Development

When a Validating/Mutating Webhook captures an admission request, the Kubernetes API server makes an HTTPS POST request to a configured service and URL path. To continue with Kubernetes-centric ideals, the server was developed in Golang to serve these URL paths as HTTPS API endpoints.

Here is a stripped-down example of admission validation on a Job Kubernetes resource where a container cannot be named contain-yourself:

  1. We first need to serve HTTP with TLS by referencing the certificate and key files.

    server := NewServer( portNumber )
    if err := server.ListenAndServeTLS( "/etc/certs/tls.crt", "/etc/certs/tls.key" ); err != nil {
        // failed to listen and serve
    }
    
  2. Have NewServer() provide an HTTP server with the required configuration and routing.

    func NewServer( portNumber string ) *http.Server {
        mux := http.NewServeMux()
    
        // define server routes
        mux.Handle( "/validate/jobs", ServeAdmissionValidation() )
    
        return &http.Server{
            Addr:    fmt.Sprintf(":%s", portNumber),
            Handler: mux,
        }
    }
    
  3. Provide the ServeAdmissionValidation() handler for AdmissionReview decoding/encoding and data review.

    func ServeAdmissionValidation() http.HandlerFunc {
        return func(w http.ResponseWriter, r *http.Request) {
            // set response header to define as json
            w.Header().Set("Content-Type", "application/json")
    
            // try read request body
            body, err := io.ReadAll(r.Body)
            if err != nil {
                // could not read request body
            }
    
            // try decode admission review
            var review admissionApi.AdmissionReview
            if _, _, err := serializer.NewCodecFactory(runtime.NewScheme()).UniversalDeserializer().Decode(body, nil, &review); err != nil {
                // could not deserialize request
            }
    
            // call function to verify job validity and return admission response
            admissionResponse := ValidateJob(review.Request)
    
            // json encode admission review response data
            res, err := json.Marshal(admissionResponse)
            if err != nil {
                // could not encode response to json
            }
    
            // respond with ok status and response data
            w.WriteHeader(http.StatusOK)
            w.Write(res)
        }
    }
    
  4. Create rules in ValidateJob() to parse the object and verify admission validity.

    func ValidateJob(req *admissionApi.AdmissionRequest) admissionApi.AdmissionReview {
        // Parse Job type from raw json-encoded data within AdmissionRequest
        var jb batchApi.Job
        if err := json.Unmarshal(req.Object.Raw, &jb); err != nil {
            return admissionApi.AdmissionReview{
                Response: &admissionApi.AdmissionResponse{
                    UID:     req.UID,
                    Allowed: false,
                    Result:  &meta.Status{Message: err.Error()},
                },
            }
        }
    
        // verify Container Name
        for _, c := range jb.Spec.Template.Spec.Containers {
            if c.Name == "contain-yourself" {
                return admissionApi.AdmissionReview{
                    Response: &admissionApi.AdmissionResponse{
                        UID:     req.UID,
                        Allowed: false,
                        Result:  &meta.Status{Message: "container cannot be named 'contain-yourself'"},
                    },
                }
            }
        }
    
        return admissionApi.AdmissionReview{
            Response: &admissionApi.AdmissionResponse{
                UID:     req.UID,
                Allowed: true,
                Result:  &meta.Status{Message: "ok"},
            },
        }
    }
    

Even though the aforementioned code is not exactly how our code base executes this, the intent should be clear.

Helm Chart Curation

Configuration

Carrying on with the hypothetical of Jobs admission review, let’s dissect a likely implementation of its associated Validating Webhook Configuration.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  annotations:
    helm.sh/hook: post-install
webhooks:
- name: job-validating-webhook.reece.tech
  admissionReviewVersions: ["v1beta1", "v1"]
  clientConfig:
    caBundle: ""
    service:
      name: "admission-control-service"
      namespace: "hypothetical"
      port: 443
      path: "/validate/jobs"
  failurePolicy: Fail
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    resources: ["jobs"]
    operations: ["CREATE", "UPDATE"]
    scope: Namespaced
  sideEffects: None

Rules

Rules determine whether a request to the API server should be sent to the webhook. The three key areas of interest here are Scope, Resources and Operations.

You can imagine that most permutations of Scope, Resources and Operations would be excessive and exhausting for the system to evaluate under admission control. Therefore, I would consider the above Job Validating Webhook Configuration to be sensibly strict. Given that a key concern was the side-effects of shifting right, this restricts the webhook to capture only the admissions of concern that fall within our admission validation policies.

Service Config

The service stanza within clientConfig is a reference to the service for this webhook. This is the connection from the Webhook Configuration to the Admission Control Server. So, using the above configuration as a reference, our clientConfig instructs the Kubernetes API server to “consult requests with the admission-control-service that lives in the hypothetical Namespace, on port 443, at the URL path /validate/jobs”.

Helm Hook

In case you haven’t noticed already, within the configuration under metadata.annotations, the helm.sh/hook property is set to post-install.

This states that the ValidatingWebhookConfiguration will be created only after all other resources are loaded into Kubernetes. This is a protective layer to ensure that resources belonging to the same release are not denied admission, should they be captured by the Validating Webhook Configuration.

Certificate Management

Though the purpose of this exercise was to demo a use case for validating webhooks, the motive was also to build something close to production value to lower the body of work in the future. In terms of provisioning a certificate for secure communication between our server and webhook, a choice was made to utilise Certificate Signing Requests (CSRs) in Kubernetes.

Why use Certificate Signing Requests?

Implementing CSRs instead of fully self-signed certificates brings forth two choices in certificate adjudication:

How was this Implemented?

The certificate creation process was set up to follow these chronological steps:

  1. Generate Private Key
  2. Generate Certificate Request
  3. Create Kubernetes Certificate Signing Request
  4. Approve the Certificate Signing Request as Kubernetes Administrator
  5. Download the approved Certificate
  6. Create Kubernetes TLS Secret to contain Certificate and Key
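The first two steps can be sketched with openssl. The service name, namespace and file names below are illustrative assumptions, not our actual configuration:

```shell
# 1. Generate a private key
openssl genrsa -out tls.key 2048

# 2. Generate a certificate request, including a Subject Alternative Name
#    matching the in-cluster service DNS name (placeholder values)
openssl req -new -key tls.key -out server.csr \
  -subj "/CN=admission-control-service.hypothetical.svc" \
  -addext "subjectAltName=DNS:admission-control-service.hypothetical.svc"
```

Steps 3 through 6 then map onto kubectl: apply a CertificateSigningRequest containing the base64-encoded server.csr, approve it with kubectl certificate approve, extract the issued certificate from the CSR’s status, and create the secret with kubectl create secret tls.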

How does this relate to the Admission Control Server?

Do you remember the first point in the Admission Control Server Development stage?

It stated that the server should ListenAndServeTLS using the certificate and key file paths /etc/certs/tls.crt and /etc/certs/tls.key.

Describing a TLS Secret with kubectl outputs the following information. Do the filenames on the bottom two lines look familiar?

Name:         admission-tls
Namespace:    hypothetical
Labels:       <none>
Annotations:  <none>

Type:  kubernetes.io/tls

Data
====
tls.crt:  1314 bytes
tls.key:  1675 bytes

For this reason, we can simply provide the secret as a volume mount on the Admission Control server pod with the following stanzas:
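A sketch of those stanzas, assuming the admission-tls secret from earlier (the volume and container names are illustrative):

```yaml
# Pod spec fragment for the Admission Control server
volumes:
- name: admission-certs
  secret:
    secretName: admission-tls
containers:
- name: admission-control-server
  volumeMounts:
  - name: admission-certs
    mountPath: /etc/certs
    readOnly: true
```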

Thus, the certificate and key are available for filesystem access on the Admission Control server.

Results

Viability

In release testing, admission control looks like a viable option. When the Validating Webhook blocks a resource within a Helm chart, the release is set to a failed status and returns a non-zero exit code, which proves it can sustain automated deployment pipelines.

Helm

One caveat is that Helm Chart installations will most likely require the --atomic flag. This way we can avoid a segmented release due to partial resource admission denial, enforcing an “everything or nothing” stance.

Admission Evaluation Time

Timing tests on a local cluster showed little to no difference, addressing the concerns of shifting right. The test cases observed for timing were:

Testing with no validating webhook over 10 iterations showed only natural variance (in milliseconds). Without adding an excessive amount of complexity to the admission control server, I don’t think we could observe an admission evaluation time worth challenging its viability. With that said, one could argue that adding enough complexity to largely affect evaluation time would indicate incorrect use, and misuse, of the server.

Challenges Encountered

Certificate Subject Alternative Name

This issue was encountered when implementing the Certificate Signing Request on a version of Kubernetes built with a newer Golang release. The issue surrounds the deprecation of relying on the CommonName field of X.509 serving certificates as a host name when no Subject Alternative Names are present. Kubernetes Github v1.19.0-rc.2 ChangeLog

The resolution was to supply Subject Alternative Names in the configuration of the generated certificate signing request. Generate SSL certificates with Subject Alt Names on OSX

Helm Hook

This issue arose during development and use-case testing, when my Validating Webhook Configuration beat my Admission Control server to admission and readiness.

What occurred was that I had set up a validating webhook with policies for denying Pods. Considering that the Admission Control server lived on a Deployment resource, I encountered a cyclic conflict: the Validating Webhook was trying to hand each Pod request to the server to verify its validity, but the very server meant to handle such requests was never allowed admission.

This was resolved by applying the post-install Helm Hook to the Validating Webhook Configuration. Helm Hooks

Useful Resources