Building for Compliance - Part I

Written by Gustavo Carvalho | Jul 3, 2023 1:41:05 PM

As Container Solutions grows and consults with more and more businesses, we notice common pain points encountered whenever we talk about Cloud Native platforms. One of these areas is compliance.

‘Born Right’

As is well known, a key aspect of DevSecOps consists of shifting security principles and responsibilities to the ‘left’ (ie earlier in the development cycle), leveraging automation to ensure that software entities are “born right”. “Born right” encompasses several aspects, including container security, runtime security, auditing and data regulations.

“Are you Sure You’ve Met All the Requirements?”

Once you’ve ‘shifted left’ and automated as much as you can to ensure that software entities such as VMs and security groups are ‘born right’, a challenge can arise from parts of your business related to security and audit: how do you know that the long list of security requirements for your application is being checked and enforced?

At this point, you explain to the audit function both the control measures and the curation process you have, to ensure that every possible service you are offering is approved by the local security authority to be compliant according to the relevant regulations.

You compromise, and decide to centralise every single deployment through a single standard process. You come up with ways to ensure identities and permissions are set correctly. You show the control points on your source code (also called ‘gatechecks’: verifications that the control implemented in the code, “the gate”, is working properly). You evidence all the hardening of your Git repos to make sure it produces a valid audit trail, and you go on a (normally extensive) quest to show that everything is in the right place. And then the unexpected happens: a new technology appears, a new service is delivered by that cool startup, or a new feature is added to your already curated service. For any new product, service, or feature, you have to do all of that again.

We Are Not Treating Compliance as a Customer

Why are compliance processes so often both troublesome and underdeveloped on a software level? One answer to that question is that we don’t consider compliance outputs to be an essential part of our software architecture. We take compliance requirements into account when we build (hopefully), but are not building around them: compliance is not our client.

This happens because Agile frameworks typically focus on the customers facing the Product Owner. Whether those customers are internal or external, they tend to be interested in the application interface and features. Feedback loops are created around them to make sure market fit is correct. It makes sense for this to be the focus, but in regulated environments this is where we see performance drift: third-party requirements unrelated to your business needs slow down (if not completely stop) delivery.

To give another, more practical, example: it is more common to see a Service Level Objective set up for an application’s response time than for its compliance status. We tend to focus on getting observability from our application in order to keep it running, but we do not get any observability data about our application’s compliance. As an industry, we are not tracking compliance in an automated way.
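To make the idea of a compliance Service Level Objective concrete, here is a minimal sketch in Python. It assumes that some scanner already produces a pass/fail result per resource; the `CheckResult` type, the resource names, and the 99% target are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one compliance evaluation (hypothetical shape)."""
    resource_id: str
    compliant: bool


def compliance_sli(results):
    """Return the fraction of evaluated resources that are compliant.

    An empty result set is treated as fully compliant (1.0); that is an
    assumption, and a real system might prefer to flag "no data" instead.
    """
    if not results:
        return 1.0
    return sum(r.compliant for r in results) / len(results)


def meets_slo(results, target=0.99):
    """True when the compliance indicator is at or above the objective."""
    return compliance_sli(results) >= target


# Illustrative data: three compliant resources and one that is not.
results = [
    CheckResult("vm-1", True),
    CheckResult("vm-2", True),
    CheckResult("bucket-1", False),
    CheckResult("db-1", True),
]
```

With this data the indicator is 0.75, so a 99% objective is missed. The point is only that compliance status can be expressed and alerted on with the same kind of indicator we already use for response time.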

Modelling for Compliance-First

Having established that we should think of compliance observability as a key feature of the software we deliver, we next need to think about which aspects of compliance we need to observe. First, we need to be able to observe the software entity’s implementation: is the platform/application/solution built in a way that conforms to a set of (living) standards? Second, are there any running elements on that platform/application/solution that might be non-compliant due to drift? By ‘drift’ I mean a third-party action that causes one or more elements of your application to stop being compliant.

These two questions are complementary. One of them is focused on the lifecycle of the resources within the platform, which includes all our VMs, Pods, Databases, any PaaS/SaaS that is offered through the Platform, the networking stack, and so on. The other one focuses on the livelihood of the platform itself: does it still enable the creation of a compliant environment, and does it appropriately enforce an environment that can be trusted? ‘Platform’ here can mean an on-premises data centre infrastructure, Public Cloud, or even Kubernetes clusters. They are all tools on top of which one can build applications.

Once a platform is stable, many different events or changes can move the system from compliant to non-compliant. These include the release of new features, the deprecation of old features, or a new security requirement.

The Elements of Compliance

Any full-fledged compliance framework should have the following capabilities:

Preventative, Detective, and Reactive Controls
Auditing and Attestation
Monitoring, Threat Detection, and Response
Training and Education

Preventative, Detective, and Reactive Controls

A set of rules defined at the platform level to make sure the platform itself is compliant with the controls. These are typically ways to make sure that no cloud resource can be created with a non-compliant configuration. Tools that focus on this can come ‘out of the box’ with some providers (such as security groups in AWS, for example), while others are commodity third-party tools such as OPA Gatekeeper and Conftest, and still others can be custom-built.
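As an illustration of a custom-built preventative control, the sketch below rejects a resource definition before it is created if it violates any rule. It is plain Python and does not use the OPA Gatekeeper or Conftest APIs; the resource fields (`ingress`, `cidr`, `encrypted`) and both rules are assumptions made up for the example.

```python
def no_public_ingress(resource):
    """Return a violation message if any ingress rule is open to the world."""
    for rule in resource.get("ingress", []):
        if rule.get("cidr") == "0.0.0.0/0":
            return "ingress open to 0.0.0.0/0"
    return None


def encryption_enabled(resource):
    """Return a violation message if encryption at rest is not switched on."""
    if not resource.get("encrypted", False):
        return "encryption at rest is disabled"
    return None


# Hypothetical rule set; a real platform would load this from versioned policy.
RULES = [no_public_ingress, encryption_enabled]


def admit(resource):
    """Evaluate every rule and return (allowed, list_of_violations)."""
    violations = []
    for rule in RULES:
        message = rule(resource)
        if message is not None:
            violations.append(message)
    return (not violations, violations)


good = {"encrypted": True, "ingress": [{"cidr": "10.0.0.0/8"}]}
bad = {"encrypted": False, "ingress": [{"cidr": "0.0.0.0/0"}]}
```

Calling `admit(bad)` returns `False` together with both violation messages, which is the information a deployment pipeline would need in order to block the change and tell the engineer why.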

Auditing and Attestation

This aspect of the system should check and record whether a given resource is compliant with a set of versioned controls. This is key to ensuring the proper management of the requirement lifecycle.
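One way to picture such a record is sketched below: each check stores the version of the control set it was evaluated against, plus a digest of the configuration that was checked, so an auditor can later tell which requirements applied at the time. The control set, its version string, and the record fields are all hypothetical; this is not the format of any particular tool or standard.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical versioned control set: expected value per configuration key.
CONTROL_SET = {"version": "2023.07", "controls": {"encrypted": True}}


def attest(resource_id, config, control_set, now=None):
    """Build an attestation record for one resource.

    The configuration is serialised with sorted keys so that the same
    settings always produce the same SHA-256 digest.
    """
    failed = sorted(
        key
        for key, expected in control_set["controls"].items()
        if config.get(key) != expected
    )
    canonical = json.dumps(config, sort_keys=True).encode("utf-8")
    return {
        "resource_id": resource_id,
        "control_set_version": control_set["version"],
        "config_sha256": hashlib.sha256(canonical).hexdigest(),
        "compliant": not failed,
        "failed_controls": failed,
        "checked_at": (now or datetime.now(timezone.utc)).isoformat(),
    }


record = attest(
    "db-1",
    {"encrypted": True, "region": "eu"},
    CONTROL_SET,
    now=datetime(2023, 7, 3, tzinfo=timezone.utc),
)
```

Because the record carries `control_set_version`, a later change to the controls does not rewrite history: old records remain valid evidence of compliance with the version that applied when they were produced.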

Monitoring, Threat Detection, and Response

Non-compliant resources are bound to exist in the platform as time passes, even if the platform itself was deemed compliant in the past (eg a policy update which causes old resources to fall out of compliance). Hence, automatic remediation, reporting capabilities, and a proper threat detection and response system also need to be in place. This is key to providing continuous compliance metrics, which are becoming more important to regulators as time goes on.
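The policy-update case can be sketched as a re-evaluation of the existing inventory against the new controls. In the example below, the two control versions, the inventory, and the field names are hypothetical, and controls are compared by simple equality, which is a simplification (a real check for a minimum TLS version would compare versions rather than strings).

```python
# Hypothetical control sets: version 2 adds a minimum TLS version requirement.
CONTROLS_V1 = {"encrypted": True}
CONTROLS_V2 = {"encrypted": True, "tls_min_version": "1.2"}


def failed_controls(config, controls):
    """Return the sorted names of controls the configuration does not satisfy."""
    return sorted(key for key, expected in controls.items() if config.get(key) != expected)


def find_drift(inventory, old_controls, new_controls):
    """Return resources that passed the old controls but fail the new ones.

    The result maps each drifted resource id to the controls it now fails.
    Resources that were already non-compliant are not reported as drift.
    """
    drifted = {}
    for resource_id, config in inventory.items():
        was_compliant = not failed_controls(config, old_controls)
        now_failed = failed_controls(config, new_controls)
        if was_compliant and now_failed:
            drifted[resource_id] = now_failed
    return drifted


inventory = {
    "db-1": {"encrypted": True, "tls_min_version": "1.2"},
    "db-2": {"encrypted": True, "tls_min_version": "1.0"},
    "db-3": {"encrypted": False, "tls_min_version": "1.2"},
}
drift_report = find_drift(inventory, CONTROLS_V1, CONTROLS_V2)
```

Here only `db-2` is reported: it was compliant under the first version and fell out of compliance purely because the policy changed. A report like this is the kind of input an automatic remediation or alerting step would consume.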

Training and Education

In order to support the framework, people need appropriate training on why all of these elements are in place and what their specific functions are.

Available Landscape and Tooling

Many tools exist, both open and closed source, that provide a framework for compliance. Among them are cloud-agnostic ones such as Wiz and Fugue, and provider-specific ones such as Microsoft Defender for Cloud, Microsoft Sentinel, AWS Security Hub, and GCP Security Command Center.

The main issue with the provider-specific platforms is that they are a very good fit if, and only if, you are using a single cloud provider. By contrast, if you are looking for a cloud-agnostic solution, then the issue with the ones currently available is that their coverage lags behind the features and product delivery of the cloud providers. Such cloud-agnostic products normally provide ways to input customised alarms, which means the engineering effort of properly configuring and managing the tool’s lifecycle still exists.

Closed source tooling is also quite expensive, and most of the time it is only used in highly regulated environments where it is mandatory. In the open source landscape, the offerings are improving, with projects like cloud-custodian, openscap and complianceascode. There is also a lot of standardisation happening, with initiatives such as OSCAL. It is promising to see these types of technologies emerge, but they still only tackle the ‘Preventative, Detective, and Reactive’ aspects mentioned above, and do not cover the ‘Auditing and Attestation’ and ‘Monitoring, Threat Detection, and Response’ aspects at all.

Is your Platform Compliant?

A compliant platform allows your business to target more restricted markets, and also gives you confidence that common security breaches are unlikely to happen. Even though the required level of compliance is not set in stone, and might vary according to the company’s stage, size and market, every organisation would benefit from cloud compliance in order to prevent the most common attack patterns.

Here we have focussed on a limited set of compliance aspects, but the discussion of tooling could be broadened to cover many others, such as cyber recovery and operational resilience.

Implementing a compliance lifecycle management system can be a very onerous task even in organisations with dedicated security and audit teams. They may lack the technical skills to build such a solution, for example. In such situations, we at Container Solutions can use our experience to help you with the design and implementation of your Framework according to your context, whether you intend to use open source or enterprise tooling for your compliance strategy.
