OPA Constraint Framework
Introduction
What is a Constraint?
A constraint is a declaration that its author wants a system to meet a given set of
requirements. For example, if I have a system with objects that can be labeled and
I want to make sure that every object has a billing label, I might write the
following constraint YAML:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: FooSystemRequiredLabel
metadata:
  name: require-billing-label
spec:
  match:
    namespace: ["expensive"]
  parameters:
    labels: ["billing"]
Once this constraint is enforced, all objects in the expensive namespace will be
required to have a billing label.
What is an Enforcement Point?
Enforcement Points are places where constraints can be enforced. Examples include Git
hooks, Kubernetes admission controllers, and audit systems. The goal of this
project is to make it easy to take a common set of constraints and apply them to
multiple places in a workflow, improving the likelihood of compliance.
What is a Constraint Template?
Constraint Templates allow people to declare new constraints. They can provide the
expected input parameters and the underlying Rego necessary to enforce their
intent. For example, to define the FooSystemRequiredLabel constraint kind
used above, I might write the following template YAML:
apiVersion: gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: foosystemrequiredlabels
spec:
  crd:
    spec:
      names:
        kind: FooSystemRequiredLabel
      validation:
        # Schema for the parameters field of the constraint
        openAPIV3Schema:
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      libs:
        - |
          package lib.helpers
          make_message(missing) = msg {
            msg := sprintf("you must provide labels: %v", [missing])
          }
      rego: |
        package foosystemrequiredlabels
        import data.lib.helpers
        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.request.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := helpers.make_message(missing)
        }
The most important pieces of the above YAML are:
- validation, which provides the schema for the parameters field for the constraint.
- targets, which specifies what "target" (defined later) the constraint applies to. Note
  that currently constraints can only apply to one target.
- rego, which defines the logic that enforces the constraint.
- libs, which is a list of all library functions that will be available
  to the rego package. Note that all packages in libs must have lib as
  a prefix (e.g. package lib.<something>).
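To make the logic of the template's rego rule easier to follow, here is a minimal Go sketch of the same check: build the set of required labels, subtract the labels the object provides, and report a violation only when something is left over. The function names missingLabels and violationMessage are hypothetical and not part of the framework. The message formatting only approximates what Rego's sprintf would produce.

```go
package main

import (
	"fmt"
	"sort"
)

// missingLabels returns the required labels that are absent from provided.
// It mirrors `missing := required - provided` in the template's rego rule.
// Duplicates in required are collapsed, and the result is sorted so the
// output is stable.
func missingLabels(required []string, provided map[string]string) []string {
	seen := map[string]bool{}
	missing := []string{}
	for _, label := range required {
		if seen[label] {
			continue
		}
		seen[label] = true
		if _, ok := provided[label]; !ok {
			missing = append(missing, label)
		}
	}
	sort.Strings(missing)
	return missing
}

// violationMessage plays the role of make_message in the lib.helpers package.
func violationMessage(missing []string) string {
	return fmt.Sprintf("you must provide labels: %v", missing)
}

func main() {
	required := []string{"billing"}                   // spec.parameters.labels
	provided := map[string]string{"team": "payments"} // labels on the object

	// Equivalent of `count(missing) > 0`: only report when labels are missing.
	if missing := missingLabels(required, provided); len(missing) > 0 {
		fmt.Println(violationMessage(missing))
	}
}
```

An object that carries every required label yields an empty result, so no violation is reported.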
Rego Semantics for Constraints
There are a few rules for the Rego constraint source code:
- Everything is contained in one package
- Limited external data access
  - No imports
  - Only certain subfields of the data object can be accessed:
    - data.inventory allows access to the cached objects for the current target
  - Full access to the input object
- Specific rule signature schema (described below)
Rule Schema
While template authors are free to include whatever rules and functions they wish
to support their constraint, the main entry point called by the framework has a
specific signature:
violation[{"msg": msg, "details": {}}] {
  # rule body
}
- The rule name must be violation.
- msg is the string message returned to the violator. It is required.
- details allows for custom values to be returned. This helps support uses like
  automated remediation. There is no predefined schema for the details object.
  Returning details is optional.
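The shape of a single violation can be pictured as a small record with a required msg and an optional, schema-free details object. The Go sketch below is only an illustration of that shape: the violation type and the parseViolation function are hypothetical names, and the framework's own handling of results is not shown here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// violation is a hypothetical Go view of one element of the violation set.
// Details has no predefined schema, so it is kept as a free-form map.
type violation struct {
	Msg     string                 `json:"msg"`
	Details map[string]interface{} `json:"details,omitempty"`
}

// parseViolation decodes one violation and enforces the single rule stated
// above: msg is required, details is optional.
func parseViolation(raw []byte) (violation, error) {
	var v violation
	if err := json.Unmarshal(raw, &v); err != nil {
		return violation{}, err
	}
	if v.Msg == "" {
		return violation{}, errors.New("violation is missing the required msg field")
	}
	return v, nil
}

func main() {
	ok := []byte(`{"msg": "you must provide labels", "details": {"missing_labels": ["billing"]}}`)
	v, err := parseViolation(ok)
	fmt.Println(v.Msg, v.Details["missing_labels"], err)

	// A violation without msg is rejected; a violation without details is fine.
	_, err = parseViolation([]byte(`{"details": {}}`))
	fmt.Println(err)
}
```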
What is a Target?
A target is an abstract concept. It represents a coherent set of objects that share a
common identification and/or selection scheme, serve a generic purpose, and can be analyzed
in the same validation context. This is probably best illustrated by a few examples.
Examples
Kubernetes Admission Webhooks Create a Target
All Kubernetes resources are defined by group, version and kind. They can
additionally be grouped by namespace, or by using label selectors. Therefore they
have a common naming and selection scheme. All Kubernetes resources declaratively
configure the state of a Kubernetes cluster, therefore they share a purpose.
Finally, they can all be evaluated using a Validating Admission Webhook.
Therefore, they have a common validation context. These three properties make
Kubernetes admission webhooks a potential target.
Kubernetes Authorization Webhooks Create a Target
All Kubernetes requests can be defined by their type (e.g. CREATE, UPDATE,
WATCH) and therefore have a common selection scheme. All Kubernetes requests
broadcast the requestor's intent to modify the Kubernetes cluster. Therefore they
have a common purpose. All requests can be evaluated by an authorization webhook
and therefore they share a common validation context.
How Do I Know if [X] Should be a Target?
Currently there are no hard and fast litmus tests for determining a good boundary
for a target, much like there are no hard and fast rules for what should be in a
function or a class; there are only guidelines, ideology, and notions such as
orthogonality and testability. Chances are, if you can come up with a set of rules for
a new system that could be useful, you may have a good candidate for a new target.
Creating a New Target
Targets have a relatively simple interface:
type TargetHandler interface {
	GetName() string
	MatchSchema() apiextensions.JSONSchemaProps
	Library() *template.Template
	ProcessData(interface{}) (bool, string, interface{}, error)
	HandleReview(interface{}) (bool, interface{}, error)
	HandleViolation(result *types.Result) error
	ValidateConstraint(*unstructured.Unstructured) error
}
The most interesting methods here are HandleReview(), MatchSchema(), and Library().
HandleReview()
HandleReview() determines whether and how a target handler is involved with a
Review() request (which checks to make sure an input complies with all
constraints). It returns true if the target should be involved with reviewing the
object, and the second return value defines the schema of the input.review object
available to all constraint rules.
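As a rough illustration of the HandleReview() contract, the Go sketch below shows a handler that claims only the inputs it recognizes and returns the value it wants exposed as input.review. The fooTarget and labeledObject types are hypothetical, only this one method is sketched, and the layout of the returned review object is an assumption made for the example.

```go
package main

import "fmt"

// labeledObject is a hypothetical input type that this sketch's target understands.
type labeledObject struct {
	Name   string
	Labels map[string]string
}

// fooTarget is a hypothetical target handler. A real handler would implement
// every method of TargetHandler; only HandleReview is shown.
type fooTarget struct{}

// HandleReview reports whether this target should take part in reviewing obj.
// When it does, the second return value is what constraint rules would see
// as input.review.
func (fooTarget) HandleReview(obj interface{}) (bool, interface{}, error) {
	o, ok := obj.(labeledObject)
	if !ok {
		// Not an object this target knows about: decline without an error.
		return false, nil, nil
	}
	review := map[string]interface{}{
		"name":   o.Name,
		"labels": o.Labels,
	}
	return true, review, nil
}

func main() {
	target := fooTarget{}

	handled, review, err := target.HandleReview(labeledObject{
		Name:   "db-1",
		Labels: map[string]string{"billing": "team-a"},
	})
	fmt.Println(handled, review, err)

	// An unrecognized input is skipped rather than rejected.
	handled, review, err = target.HandleReview("not a labeled object")
	fmt.Println(handled, review, err)
}
```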
MatchSchema()
MatchSchema() tells the system the schema for the match field of every
constraint using the target handler. It uses the same schema as Kubernetes' Custom Resource Definitions.
Library()
Library() is a hook that lets the target handler express the relationship between
constraints, input data, and cached data. The target handler must return a Golang
text template that forms a Rego module with at least two rules:
- matching_constraints[constraint]
  - Returns all constraint objects that satisfy the match criteria for
    a given input. The parameters of this constraint will be assigned
    to input.parameters.
- matching_reviews_and_constraints[[review, constraint]]
  - Returns a review that corresponds to all cached data for the target. It
    also returns a constraint for every constraint relevant to a review.
    Values will be made available to constraint rules as input.parameters and
    input.review.
Note that the Library() module will be sandboxed much as constraint rules
are sandboxed, but with the following additional freedoms:
- data.constraints is available
- data.external is available
To make it easier to write these rules and to allow the framework to
transparently change its data layout without requiring redevelopment work
by target authors, the following template variables are provided:
- ConstraintsRoot references the root of the constraints tree for the target. Beneath
  this root, constraints are organized by kind and metadata.name.
- DataRoot references the root of the data tree for the target. Beneath
  this root, objects are stored under the path provided by ProcessData().
Integrating With an Enforcement Point
To effectively run reviews and audits, enforcement points need to be able to perform the
following tasks:
- Add/Remove templates
- Add/Remove constraints
- Add/Remove cached data
- Submit an object for a review
- Request an audit of the cached data
The Client type orchestrates these operations between a set of targets and the
backend policy system.
To facilitate enforcement point integration, Client exports the following methods:
AddData(context.Context, interface{}) (*types.Responses, error)
RemoveData(context.Context, interface{}) (*types.Responses, error)
CreateCRD(context.Context, *templates.ConstraintTemplate) (*apiextensions.CustomResourceDefinition, error)
AddTemplate(context.Context, *templates.ConstraintTemplate) (*types.Responses, error)
RemoveTemplate(context.Context, *templates.ConstraintTemplate) (*types.Responses, error)
AddConstraint(context.Context, *unstructured.Unstructured) (*types.Responses, error)
RemoveConstraint(context.Context, *unstructured.Unstructured) (*types.Responses, error)
ValidateConstraint(context.Context, *unstructured.Unstructured) error
Reset(context.Context) error
Review(context.Context, interface{}) (*types.Responses, error)
Audit(context.Context) (*types.Responses, error)
Dump(context.Context) (string, error)
CreateCRD() has a unique signature because it returns the Kubernetes Custom
Resource Definition that, once registered, allows constraints to be created.
Requests to a client will be multiplexed to all registered targets. Every target that
self-reports as being able to handle the request can add response values.
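The multiplexing described above can be pictured as a loop that offers the request to each registered target and collects a contribution from every target that claims it. The Go sketch below uses a trimmed-down, hypothetical handler interface and a hypothetical funcTarget type. It does not reproduce the client's actual implementation or its response types.

```go
package main

import "fmt"

// handler is a trimmed-down, hypothetical stand-in for TargetHandler.
type handler interface {
	GetName() string
	HandleReview(obj interface{}) (bool, interface{}, error)
}

// funcTarget is a hypothetical target that handles whatever its predicate accepts.
type funcTarget struct {
	name    string
	accepts func(obj interface{}) bool
}

func (t funcTarget) GetName() string { return t.name }

func (t funcTarget) HandleReview(obj interface{}) (bool, interface{}, error) {
	if !t.accepts(obj) {
		return false, nil, nil
	}
	return true, obj, nil
}

// multiplex offers obj to every handler and keeps one entry, keyed by target
// name, for each handler that reported it could handle the request.
func multiplex(handlers []handler, obj interface{}) (map[string]interface{}, error) {
	reviews := map[string]interface{}{}
	for _, h := range handlers {
		handled, review, err := h.HandleReview(obj)
		if err != nil {
			return nil, fmt.Errorf("target %s: %w", h.GetName(), err)
		}
		if handled {
			reviews[h.GetName()] = review
		}
	}
	return reviews, nil
}

func main() {
	isString := func(obj interface{}) bool { _, ok := obj.(string); return ok }
	isInt := func(obj interface{}) bool { _, ok := obj.(int); return ok }

	handlers := []handler{
		funcTarget{name: "strings", accepts: isString},
		funcTarget{name: "ints", accepts: isInt},
	}

	// Only the "strings" target claims this request.
	reviews, err := multiplex(handlers, "hello")
	fmt.Println(reviews, err)
}
```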
types.Responses is a wrapper around zero or more Result objects. Each result
object has the following fields:
type Result struct {
	Msg        string                     `json:"msg,omitempty"`
	Metadata   map[string]interface{}     `json:"metadata,omitempty"`
	Constraint *unstructured.Unstructured `json:"constraint,omitempty"`
	Review     interface{}                `json:"review,omitempty"`
	Resource   interface{}
}
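The struct tags determine how a result looks when serialized to JSON: fields tagged omitempty are omitted when empty, while the untagged Resource field is always emitted under its Go name. The sketch below uses a hypothetical local copy of the struct, with Constraint replaced by a plain map so that it compiles with the standard library alone.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// result is a local, hypothetical copy of Result. Constraint is a plain map
// here instead of *unstructured.Unstructured to avoid external dependencies.
type result struct {
	Msg        string                 `json:"msg,omitempty"`
	Metadata   map[string]interface{} `json:"metadata,omitempty"`
	Constraint map[string]interface{} `json:"constraint,omitempty"`
	Review     interface{}            `json:"review,omitempty"`
	Resource   interface{}
}

// encodeResult serializes a result the way encoding/json would by default.
func encodeResult(r result) (string, error) {
	raw, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	// Only Msg is set, so every omitempty field disappears from the output,
	// while Resource is still present as null.
	out, err := encodeResult(result{Msg: "you must provide labels"})
	fmt.Println(out, err)
}
```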
Instantiating a Client
Here's how to create a client to make use of the framework:
driver := local.New()
backend, err := client.Backend(client.Driver(driver))
if err != nil {
	return err
}
cl, err := backend.NewClient(client.Targets(target1, target2, target3))
if err != nil {
	return err
}
Local and Remote Clients
There are two types of clients. The local client creates an in-process instance of OPA
to respond to requests. The remote client dials an external OPA instance
and makes requests via HTTP/HTTPS.
Debugging
There are three helpful levers for debugging:
- Client.Dump() returns all data cached in OPA and every module created in OPA.
- Drivers can be initialized with a tracing option like so:
  local.New(local.Tracing(true)).
  These traces can then be viewed by calling TraceDump() on the response.
- Traces can be performed on a per-request basis for Audit() and Review()
  requests by providing the client.Tracing(true) option argument.
  Example: results_with_tracing, err := c.Audit(context.Background(), client.Tracing(true))