
Regulating the Cloud for the Financial Services Sector

In my white paper, Banking on the Cloud, written in the summer of 2022, I highlighted the increasing direct oversight that regulators are starting to have over the Cloud Service Providers (CSPs).

Given the ongoing discussion at Monzo Bank on the topic, I thought I would revisit it, both to see how things have progressed and to assess the potential impacts on the regulators, cloud providers, the financial services industry and other tech companies offering services to them.

What powers do the regulators have over the big cloud providers?

The first question you might ask is why the regulators need to be directly involved with the cloud providers, and what powers they will have.

For the UK, financial firms are tending to concentrate their application hosting in a few regions of the three main cloud providers (AWS, Azure, GCP), which can present a systemic risk in the case of an outage at one of the cloud providers. You can imagine a situation, for example, where a large number of UK financial companies might need to simultaneously move services to a different availability zone within the affected CSP.

The kind of thing the regulator wants to know is whether there would be sufficient capacity to host the required critical applications so that UK individuals, businesses and other financial service firms aren’t impacted. If not, the CSPs could be fined, or even barred from the market in an extreme case.
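As a rough illustration of that capacity question, the sketch below checks whether the spare capacity in the surviving zones could absorb the critical workloads displaced from a failed zone. The firm names, zone names and vCPU figures are all hypothetical, and a real assessment would look at far more than a single resource dimension.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A surviving availability zone (names and figures are hypothetical)."""
    name: str
    total_vcpus: int
    used_vcpus: int

    @property
    def spare_vcpus(self) -> int:
        return self.total_vcpus - self.used_vcpus


def failover_shortfall(critical_demand: dict[str, int], surviving: list[Zone]) -> int:
    """Return the vCPU shortfall (0 if none) if all critical demand moves at once."""
    demand = sum(critical_demand.values())
    spare = sum(zone.spare_vcpus for zone in surviving)
    return max(0, demand - spare)


# Hypothetical critical workloads that were running in the failed zone.
critical_demand = {"bank_a": 4000, "bank_b": 2500, "insurer_c": 1500}

# Hypothetical surviving zones in the same region.
surviving_zones = [
    Zone("zone-b", total_vcpus=10000, used_vcpus=7000),
    Zone("zone-c", total_vcpus=10000, used_vcpus=6500),
]

shortfall = failover_shortfall(critical_demand, surviving_zones)
print(f"Shortfall: {shortfall} vCPUs")
```

With these made-up numbers the combined demand exceeds the spare capacity, which is exactly the kind of situation an individual firm cannot see from its own position but a regulator with a view across firms could.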

Currently, the financial firms themselves are accountable for ensuring that their applications and suppliers meet availability and recoverability standards, but this broader risk requires the regulators to have insight and awareness of the operational aspects of the cloud and other technology providers to the finance industry. The individual banks themselves will still be responsible for both their and their suppliers' operational resilience meeting the required levels. This is no different to other services that represent large single points of failure such as the SWIFT or CHAPS payment systems.

What is the EU's approach to cloud provider regulation?

We should also note that the regulatory approaches taken by the United Kingdom (UK) and the European Union (EU) are slightly different. In the EU, the Digital Operational Resilience Act (DORA) will come into force in January 2025. It looks to manage the risks that arise from the adoption of Information and Communication Technologies (ICT) and to minimise the risks to the EU financial system. The overall intention of the act is to allow cross-border ICT risks to be understood at an EU level, which would not be possible for the individual member states.

DORA allows regulators to identify third parties and then to directly audit and regulate them. However, the EU has yet to finalise a reporting framework for the regulated entities to identify and classify their third-party providers so that they too could potentially be regulated. This framework is over a year away and is still maturing, which is perhaps understandable given the number of individual countries involved.

What are the UK regulatory bodies doing?

The UK regulatory bodies, comprising the Bank of England (BoE), the Prudential Regulation Authority (PRA) and the Financial Conduct Authority (FCA), published their initial discussion paper on Operational Resilience in 2021. The paper states its driver as follows:

“the increasing reliance on a small number of CSPs and other critical third parties for vital services could increase financial stability risks in the absence of greater direct regulatory oversight of the resilience of the services they provide”.

The latest version can be found here, but it is currently not legally binding, and discussions are ongoing and evolving.

There is a framework in place to report and classify the third parties that the respective financial institutions depend on. The work has started with the cloud vendors, but it will likely also identify and flag other critical third-party providers that hadn’t been highlighted before and that will then fall into the remit of the regulators. As per this policy paper, the vehicle for the contents of the discussion paper to become law is this bill making its way through the House of Commons and the House of Lords.

What about the US and the rest of the world?

In the USA the Department of the Treasury recently produced this report. The recommendations it contains are aimed at helping financial institutions consume cloud services in a safe and secure manner. Discussions will continue with all relevant parties to help execute the recommendations and a Cloud Services Steering Group will be launched within the year.

The Financial Stability Board (FSB) reports to the G20 group of nations and brings together senior policy makers from ministries of finance, central banks, and supervisory and regulatory authorities to coordinate financial sector policies. As well as the G20, it includes four other key financial centres: Hong Kong, Spain, Switzerland and Singapore. As an organisation, it will likely drive global alignment of regulators around broadly similar rules.

The impact on the banks

Overall, we see this regulatory oversight of cloud providers as a positive step for cloud adoption across the financial services industry. The banks can’t leverage the work of the regulators directly, and they still have to address their own concentration risks (for example, by spreading workloads across different cloud providers, or by being able to move to a different cloud provider in a disaster recovery scenario) and manage their own third parties. It will, however, help with the overall awareness and maturity of conversation within this space.

For example, a bank will need to understand the Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) for each cloud service, including when a number of these are combined to form higher level PaaS (Platform-as-a-Service) offerings. But, in addition to the banks all needing to ask for this individually, the regulators will also be asking the cloud providers for the same information, as well as the architectural inter-dependencies between their services. The latter is harder information for the individual banks to get, as (understandably) the cloud providers are generally guarded about this, as it could give a competitive advantage if it were in the public domain.
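To illustrate why the inter-dependencies matter, here is a minimal sketch of estimating a composite RTO and RPO for a higher-level service built from several underlying ones. It assumes that a service can only start recovering once everything it depends on is back, so the composite RTO is the longest chain through the dependency graph, and it takes the composite RPO to be the worst individual RPO in that chain. The service names, the figures and the model itself are illustrative assumptions, not published CSP data.

```python
from functools import lru_cache

# Hypothetical per-service objectives in minutes: (RTO, RPO).
objectives = {
    "storage": (30, 5),
    "database": (45, 15),
    "messaging": (20, 1),
    "paas_app": (10, 0),
}

# Hypothetical dependencies: service -> services that must be restored first.
depends_on = {
    "storage": [],
    "database": ["storage"],
    "messaging": ["storage"],
    "paas_app": ["database", "messaging"],
}


@lru_cache(maxsize=None)
def composite_rto(service: str) -> int:
    """Own RTO plus the slowest dependency chain (assumes an acyclic graph)."""
    slowest = max((composite_rto(dep) for dep in depends_on[service]), default=0)
    return objectives[service][0] + slowest


def transitive_dependencies(service: str) -> set[str]:
    """All services reachable from `service` through the dependency graph."""
    found: set[str] = set()
    stack = list(depends_on[service])
    while stack:
        dep = stack.pop()
        if dep not in found:
            found.add(dep)
            stack.extend(depends_on[dep])
    return found


def composite_rpo(service: str) -> int:
    """Worst (largest) RPO among the service and everything it depends on."""
    services = transitive_dependencies(service) | {service}
    return max(objectives[name][1] for name in services)


print(composite_rto("paas_app"))  # 85 (10 + database 45 + storage 30)
print(composite_rpo("paas_app"))  # 15 (driven by the database)
```

Without the dependency map, a bank could only read off the 10-minute figure for the top-level service, which is why the architectural information the regulators are asking for changes the picture so much.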

Another area that will benefit is the reporting of outages, performance issues or breaches by the cloud providers to the financial services consumers of their services, which can be formalised and standardised.

In our experience, most banks use a mixture of on-premises and public cloud hosting, and not all workloads in major banks are destined for the public cloud in the next few years. For example, where data is classified as secret or is subject to data sovereignty restrictions, certain applications cannot migrate. So there will be a level of hybrid cloud adoption for at least the short to medium term.

In many regards it doesn’t really change an individual financial institution’s adoption of public cloud hosting, but it might have impacts in the future which I will cover in the next section. As Andrew Ellam at Monzo Bank points out, the risk and compliance areas of the bank will be working on these items and it will drive a closer conversation with the SREs working in this operational resilience space.

The impact on the cloud and other tech providers

The cloud providers will be set resilience targets by the regulators. How far-reaching and detailed these will be we don’t know as yet, but the regulators are not tech providers and do not know the architectural needs of the cloud design better than the cloud provider.

We’re not yet seeing this happen, but hypothetically it might encourage the cloud providers to split off part of their infrastructure and dedicate it to the financial industry, in the same way that they have created gov (government dedicated) clouds today. The advantage of this is that it protects the majority of a cloud provider’s infrastructure from the scope of the regulators, and allows the cloud vendors to innovate faster elsewhere. Of course this comes with a downside: gov clouds don’t tend to get the latest updates or services released to them straight away. It also incurs additional costs for the cloud provider.

In the case of the gov clouds, it was worth the cloud vendor’s investment as they received a significant up-front, multi-year commitment to build them, but would this happen for clouds dedicated to financial services? How will the regulators prioritise the financial services industry against other industry sectors hosted by a cloud provider in the case of an outage? These are questions that will need to be answered.

Another option for tackling the operational resilience challenge for applications within a financial services company is to be able to leverage the hosting capabilities of multiple cloud providers. In practice this means either being able to move workloads between them relatively quickly or to have workloads spread natively across different cloud vendors.

This is still a developing area, and Infrastructure as Code (IaC) tools such as Terraform, Crossplane and Pulumi play in this space. The use of containers also helps portability to a degree (Red Hat OpenShift, for example, is a container orchestration platform that runs on different CSPs). Then there are serverless ecosystems such as serverless.com and dapr.io (Distributed Application Runtime), amongst others. There are also PaaS data services that natively span clusters across CSPs or CSP regions, such as MongoDB Atlas and CockroachDB, or that can be hosted on different CSPs, such as Confluent Kafka and Databricks.

If the above approaches continue to develop, it is possible to envisage a time when an outage at one CSP causes loads to be shifted to a different CSP, cascading the effect of the outage onto that second provider.
