Localise debugging of OPA Gatekeeper Rego Policy
Introduction
Background
In November 2022, in an attempt to better enforce policy in our Kubernetes clusters, we at Reecetech adopted OPA Gatekeeper.
Built on the Open Policy Agent (OPA) project, Gatekeeper is a Kubernetes-centric offering that serves as an admission controller utilising OPA. This admission controller provides a way to manage and enforce policies using OPA’s expressive Rego policy language.
Just like any new adopters of any piece of tech, off we headed to the documentation.
After reading the documentation, and assuming a full embrace of Gatekeeper, you may have landed on the same stack as us. TL;DR:
- A library of policies containing your templates and constraints to enforce
- Kubernetes objects to test violation or conformance against
- The Gator CLI to assert test cases against policies
Awesome! So however we decide to curate our policies, we can test our policies locally and through continuous integration.
Disallowing Anonymous Access
Our efforts in localising the debugging of Gatekeeper Rego policy stemmed from the implementation of a single policy. This policy disallows role bindings that grant rights to anonymous users within our Kubernetes clusters. It works by denying admission of role bindings tied to the system:unauthenticated group or the system:anonymous user.
This implementation was copied from the OPA Gatekeeper project as they provide a policy to Disallow Anonymous Access through their community-driven library.
Some time after implementing the aforementioned policy, by pure chance, we came across the following ClusterRoleBinding:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: oidc-reviewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:service-account-issuer-discovery
subjects:
  - kind: Group
    name: system:unauthenticated
    apiGroup: rbac.authorization.k8s.io
This cluster role binding essentially allows an unauthenticated user to access URL paths for support on how to authenticate, and access keys for identity verification.
This was not showing up as a violation in any cluster. What’s going on? This clearly is binding to the disallowed group. There was a need to poke around the test framework.
Setting Up
Structure
Let’s delve into the setup of the directory structure.
.
└── policy-library
    └── k8sdisallowanonymous
        ├── template.yaml
        ├── constraint.yaml
        ├── suite.yaml
        └── tests
            └── oidc-reviewer.yaml
template.yaml: disallow anonymous access policy copied from Disallow Anonymous Access
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sdisallowanonymous
  annotations:
    metadata.gatekeeper.sh/title: "Disallow Anonymous Access"
    metadata.gatekeeper.sh/version: 1.0.0
    description: Disallows associating ClusterRole and Role resources to the system:anonymous user and system:unauthenticated group.
spec:
  crd:
    spec:
      names:
        kind: K8sDisallowAnonymous
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          type: object
          properties:
            allowedRoles:
              description: >-
                The list of ClusterRoles and Roles that may be associated
                with the `system:unauthenticated` group and `system:anonymous` user.
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sdisallowanonymous

        violation[{"msg": msg}] {
          not is_allowed(input.review.object.roleRef, input.parameters.allowedRoles)
          review(input.review.object.subjects[_])
          msg := sprintf("Unauthenticated user reference is not allowed in %v %v ", [input.review.object.kind, input.review.object.metadata.name])
        }

        is_allowed(role, allowedRoles) {
          role.name == allowedRoles[_]
        }

        review(subject) = true {
          subject.name == "system:unauthenticated"
        }

        review(subject) = true {
          subject.name == "system:anonymous"
        }
constraint.yaml: configuration to enforce the disallow anonymous access policy template
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sDisallowAnonymous
metadata:
  name: no-anonymous
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ["rbac.authorization.k8s.io"]
        kinds: ["ClusterRoleBinding"]
suite.yaml: Gator CLI test cases for the constraint
kind: Suite
apiVersion: test.gatekeeper.sh/v1alpha1
metadata:
  name: noanonymous
tests:
  - name: no-anonymous
    template: template.yaml
    constraint: constraint.yaml
    cases:
      - name: test-oidc-reviewer
        object: tests/oidc-reviewer.yaml
        assertions:
          - violations: 'yes'
          - message: Unauthenticated user reference is not allowed
            violations: 1
oidc-reviewer.yaml: the oidc-reviewer ClusterRoleBinding object in question
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: oidc-reviewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:service-account-issuer-discovery
subjects:
  - kind: Group
    name: system:unauthenticated
    apiGroup: rbac.authorization.k8s.io
Testing
Using the preceding file and directory structure, we can run the test suite through the Gator CLI. There is no need to install anything; we can just leverage the Gator CLI Docker container.
docker run --rm \
  -v ${PWD}/policy-library:/home/nonroot/ \
  openpolicyagent/gator:v3.14.0 \
  verify -v k8sdisallowanonymous/...
This produced the following output:
=== RUN   no-anonymous
    === RUN   test-oidc-reviewer
    --- FAIL: test-oidc-reviewer (0.002s)
        unexpected number of violations: got 0 violations but want at least 1: got messages []
--- FAIL: no-anonymous (0.007s)
FAIL k8sdisallowanonymous/suite.yaml 0.007s
FAIL
Once again, we expected a violation with a message describing the issue, but the policy raised no complaint. Gator CLI is reporting that the assertions in the test case were not met. Sounds like we might need to debug the policy.
Debugging
Referring back to the documentation, Gatekeeper provides some easy-to-follow steps for debugging under Viewing the Request Object.
Fetching the Admission Review Object
This is done by forcing a violation whose message outputs the admission review input object.
diff --git a/policy-library/k8sdisallowanonymous/template.yaml b/policy-library/k8sdisallowanonymous/template.yaml
index 56a3d84..b508e5d 100644
--- a/policy-library/k8sdisallowanonymous/template.yaml
+++ b/policy-library/k8sdisallowanonymous/template.yaml
@@ -30,6 +30,10 @@ spec:
       rego: |
         package k8sdisallowanonymous
 
+        violation[{"msg": msg}] {
+          msg := sprintf("REVIEW OBJECT: %v", [input])
+        }
+
         violation[{"msg": msg}] {
           not is_allowed(input.review.object.roleRef, input.parameters.allowedRoles)
           review(input.review.object.subjects[_])
We chose to output the entire admission review input object by adding the preceding code to the beginning of the disallow anonymous access template. This would allow for easy copy-paste and testing against policy.
After re-running the Gator CLI test container, we received the following request object from standard output.
{"parameters": {}, "review": {"kind": {"group": "rbac.authorization.k8s.io", "kind": "ClusterRoleBinding", "version": "v1"}, "name": "oidc-reviewer", "object": {"apiVersion": "rbac.authorization.k8s.io/v1", "kind": "ClusterRoleBinding", "metadata": {"name": "oidc-reviewer"}, "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "ClusterRole", "name": "system:service-account-issuer-discovery"}, "subjects": [{"apiGroup": "rbac.authorization.k8s.io", "kind": "Group", "name": "system:unauthenticated"}]}}}
Cool. I followed the debugging guide. What do I do with this? And how can I make use of this admission review object with our policies?
One unfortunate thing about the Gatekeeper debugging guide is that it teaches you how to fetch the admission review object, but not how to use it, which makes it hard going for newcomers to the Gatekeeper tool set. For granular debugging, re-running the Gator CLI after each minor amendment is a slow way to narrow in on low-level behaviour. It would be better to play in a Rego environment directly.
Locally debugging policy
To debug locally, we can run an OPA container, which provides the Rego language in a Read-Evaluate-Print Loop (REPL).
docker run -it --rm openpolicyagent/opa
Since we are working with a REPL Rego policy engine, let’s predefine the underlying rules and constants before defining the violation set rule.
- input := {"parameters": {}, "review": {"kind": {"group": "rbac.authorization.k8s.io", "kind": "ClusterRoleBinding", "version": "v1"}, "name": "oidc-reviewer", "object": {"apiVersion": "rbac.authorization.k8s.io/v1", "kind": "ClusterRoleBinding", "metadata": {"name": "oidc-reviewer"}, "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "ClusterRole", "name": "system:service-account-issuer-discovery"}, "subjects": [{"apiGroup": "rbac.authorization.k8s.io", "kind": "Group", "name": "system:unauthenticated"}]}}}
  Response: Rule 'input' defined in package repl. Type 'show' to see rules.
- is_allowed(role, allowedRoles) { role.name == allowedRoles[_] }
  Response: Rule 'is_allowed' defined in package repl. Type 'show' to see rules.
- review(subject) = true { subject.name == "system:unauthenticated" }
  Response: Rule 'review' defined in package repl. Type 'show' to see rules.
- review(subject) = true { subject.name == "system:anonymous" }
  Response: Rule 'review' defined in package repl. Type 'show' to see rules.
Now we are in a position to test the Rego policy, so let’s define the violation rule:
violation[{"msg": msg}] {
  not is_allowed(input.review.object.roleRef, input.parameters.allowedRoles)
  review(input.review.object.subjects[_])
  msg := sprintf("Unauthenticated user reference is not allowed in %v %v ", [input.review.object.kind, input.review.object.metadata.name])
}
When we attempt to query the violation, the following occurs:
violation[{"msg": msg}]
Response: undefined
In Rego, this happens when the body of the violation rule never evaluates to true. Semantically, this makes sense: a “violation” is only a “violation” if it is true. If a violation is false or undefined, then it isn’t a violation. Rego treats every rule body the same way, so nothing is added to the violation set.
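The difference between false and undefined can be seen with a few throwaway rules. This is a minimal sketch, assuming the same pre-1.0 Rego syntax as the template and the admission review input shown above (where parameters is an empty object); the package name example and the rule names are hypothetical and exist only for illustration.

```rego
package example

# Undefined (not false) for our input: `parameters` is {}, so the reference
# below resolves to nothing and the body never succeeds.
has_allowed_roles {
  input.parameters.allowedRoles
}

# True for our input: `not` succeeds when its operand is undefined or false.
missing_allowed_roles {
  not input.parameters.allowedRoles
}

# A `default` gives the rule a fallback value, so this one evaluates to
# false for our input instead of being undefined.
default has_allowed_roles_or_false = false

has_allowed_roles_or_false {
  input.parameters.allowedRoles
}
```

Querying has_allowed_roles against that input yields undefined, while has_allowed_roles_or_false yields false; only the second gives a caller something to compare against.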
So what condition within the violation set is not resolving?
Let’s go more granular
We need to assess each condition within the violation rule body to home in on the underlying issue.
For a pure sanity check, let’s see if the msg assignment resolves:
- sprintf("Unauthenticated user reference is not allowed in %v %v ", [input.review.object.kind, input.review.object.metadata.name])
  Response: "Unauthenticated user reference is not allowed in ClusterRoleBinding oidc-reviewer "
No undefined result here; it looks like it resolved correctly.
Now let’s check the review() rule, which assesses whether the subject is one of the violating subjects:
- review(input.review.object.subjects[_])
  Response: true
No problems here.
Well, it must be the is_allowed() rule. Let’s assess the value.
- not is_allowed(input.review.object.roleRef, input.parameters.allowedRoles)
  Response: undefined
Found you! 👈 Since the rule just compares the values of the is_allowed arguments, we can display each argument.
- input.review.object.roleRef
  Response: {"apiGroup":"rbac.authorization.k8s.io","kind":"ClusterRole","name":"system:service-account-issuer-discovery"}
- input.parameters.allowedRoles
  Response: undefined
The allowedRoles input parameter is the culprit: the Rego policy cannot cope with this parameter being absent.
Ideally, we would have liked the parameters to be set in the constraint as follows:
 apiVersion: constraints.gatekeeper.sh/v1beta1
 kind: K8sDisallowAnonymous
 metadata:
   name: no-anonymous
 spec:
   enforcementAction: deny
   match:
     kinds:
       - apiGroups: ["rbac.authorization.k8s.io"]
         kinds: ["ClusterRoleBinding"]
+  parameters:
+    allowedRoles:
+      - some-allowed-anonymous-role
This is also apparent from the admission review input object containing an empty collection of parameters:
{"parameters": {}, "review": {"kind": ...
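Setting the parameter in the constraint is one option; another is to make the policy itself tolerate its absence. The following is a sketch of that idea rather than a confirmed upstream fix. It assumes an OPA version whose object.get built-in accepts an array path as its key (if yours does not, object.get(input.parameters, "allowedRoles", []) covers the case seen here, where parameters exists but is empty). The local variable name allowed_roles is our own.

```rego
package k8sdisallowanonymous

violation[{"msg": msg}] {
  # Fall back to an empty list when the constraint sets no allowedRoles,
  # so the lookup can no longer make the whole rule body undefined.
  allowed_roles := object.get(input, ["parameters", "allowedRoles"], [])
  not is_allowed(input.review.object.roleRef, allowed_roles)
  review(input.review.object.subjects[_])
  msg := sprintf("Unauthenticated user reference is not allowed in %v %v ", [input.review.object.kind, input.review.object.metadata.name])
}

is_allowed(role, allowedRoles) {
  role.name == allowedRoles[_]
}

review(subject) = true {
  subject.name == "system:unauthenticated"
}

review(subject) = true {
  subject.name == "system:anonymous"
}
```

With an empty list, is_allowed is undefined for every role, the not succeeds, and the oidc-reviewer binding should be reported as a violation.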
In closing, when a policy can be strict about defining a violation as true or false, there is no need to rely on undefined results.
An undefined value in a Rego expression is an easily mishandled case to be mindful of.
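One way to stay mindful of it is a Rego unit test that exercises the policy with no parameters at all. This sketch assumes the Rego from the template has been extracted into its own .rego file next to this test so that opa test can load both; the test package name, the review_without_parameters rule, and the test name are hypothetical. The input is a trimmed copy of the admission review object captured earlier.

```rego
package k8sdisallowanonymous_test

import data.k8sdisallowanonymous

# Trimmed admission review for the oidc-reviewer binding, with no parameters set.
review_without_parameters := {
  "parameters": {},
  "review": {"object": {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRoleBinding",
    "metadata": {"name": "oidc-reviewer"},
    "roleRef": {
      "apiGroup": "rbac.authorization.k8s.io",
      "kind": "ClusterRole",
      "name": "system:service-account-issuer-discovery",
    },
    "subjects": [{
      "apiGroup": "rbac.authorization.k8s.io",
      "kind": "Group",
      "name": "system:unauthenticated",
    }],
  }},
}

# Expect exactly one violation even though allowedRoles is absent.
test_unauthenticated_group_flagged_without_parameters {
  results := k8sdisallowanonymous.violation with input as review_without_parameters
  count(results) == 1
}
```

Against the template as copied, this test would be expected to fail, since the violation set is empty; it should pass once the policy tolerates a missing allowedRoles parameter.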
OPA Query-explanation/Tracing
Those adept with the OPA Gatekeeper tool set, or those with a keen eye and future prospects in Where’s Waldo, would likely have caught the issue earlier.
When debugging within the OPA container, there is the option to enable tracing/query-explanations.
- Submitting the command trace full will enable full tracing/query-explanation
  - Can be affirmed when you show debug:
     {
    -  "explain": "off",
    +  "explain": "full",
       "metrics": false,
       "instrument": false,
       "profile": false,
       "strict-builtin-errors": false
     }
- When querying the violation in the REPL, it should return the following response:
  violation[{"msg": msg}]
  Enter data.repl.violation[{"msg": msg}] = _
  | Eval data.repl.violation[{"msg": msg}] = _
  | Index data.repl.violation (matched 1 rule)
  | Enter data.repl.violation
  | | Eval __local6__ = data.repl.input.review.object.roleRef
  | | Index data.repl.input (matched 1 rule, early exit)
  | | Enter data.repl.input
  | | | Eval true
  | | | Exit data.repl.input early
  | | Eval __local7__ = data.repl.input.parameters.allowedRoles
  | | Index data.repl.input (matched 1 rule, early exit)
  | | Fail __local7__ = data.repl.input.parameters.allowedRoles
  | | Redo __local6__ = data.repl.input.review.object.roleRef
  | | Redo data.repl.input
  | | | Redo true
  | Fail data.repl.violation[{"msg": msg}] = _
  undefined
  - 🧐 Fail __local7__ = data.repl.input.parameters.allowedRoles