A good definition of Cloud Native is a method for optimizing your systems for the Cloud.
However, the downside of that definition is that the details must surely change every time the Cloud does. And AWS released 500 updates in one quarter of 2018 alone! So is Cloud Native (CN) a moving target?
The Cloud Native Computing Foundation (CNCF) used to define CN as being about scale and resilience, or “distributed systems capable of scaling to tens of thousands of self healing multi-tenant nodes.” That's not the description on their website anymore, presumably because it's not the main reason most enterprises adopt it.
Scale and resilience are fantastically useful for companies like Uber or Netflix whose business strategy is being hyperscale and taking over the entire world. But that’s not everyone’s goal (at least not immediately).
Today, the CNCF give a far less specific description, “Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds”. Hmm. Fairly waffly.
In our opinion, Cloud Native is an approach to system architecture that exploits the cloud's two most fundamental features: flexible, on-demand infrastructure that's easy to provision; and managed operational services like relational databases or queues.
That will do as a definition of what CN is. What it doesn’t tell us is the all-important why. Superficially, the generally accepted objective of a Cloud migration is to improve feature velocity, scalability, or, potentially, costs. Is that correct? And is it enough?
For most SMEs moving to cloud-optimized architectures, we find the reason is usually to increase their “feature velocity”. But again, why do they want to do that?
Feature Velocity
There is a strategic advantage in being able to get ideas to market fast, learn from them in the field and then feed that back into the next iteration (aka "feedback loops"). This is about trying out a new idea in front of users cheaply in minutes rather than expensively in months. Major technical changes are required to make this happen, but the most challenging part is usually getting this approach accepted within a company culture historically based on delivering big projects. It's a huge shift in attitude. The CN aim is to de-risk technical change. This allows companies to delegate more aggressively, which lets change happen more quickly and makes the company more responsive overall. The results are commonly a 500- or 1000-fold increase in the speed at which ideas can move from the inside of someone’s head to being in front of a user.
Scale
As a business grows, it (hopefully) needs to support more users, in more locations, on a broader range of devices, while maintaining responsiveness, managing costs, and not falling over.
Costs
With on-demand infrastructure, we can choose to pay for additional resources only as they’re needed – when new customers come online. Spending moves from up-front CAPEX (buying new machines in anticipation of success) to OPEX (paying for additional servers on demand). This doesn’t necessarily make it cheaper. It just makes it different - and less risky.
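The trade-off above can be sketched with some simple arithmetic. The prices, the 24-month horizon and the function names below are hypothetical, chosen only to illustrate the shape of the comparison; real figures vary widely by provider and workload.

```python
# Hypothetical figures for illustration only; real prices vary by provider.
UPFRONT_SERVER_COST = 3000.0   # assumed purchase price per server (CAPEX)
ON_DEMAND_MONTHLY = 150.0      # assumed rental price per server-month (OPEX)


def capex_spend(servers_bought: int) -> float:
    """Up-front spend: paid in full whether or not customers arrive."""
    return servers_bought * UPFRONT_SERVER_COST


def opex_spend(servers_needed_per_month: list[int]) -> float:
    """On-demand spend: pay each month only for the servers in use."""
    return sum(n * ON_DEMAND_MONTHLY for n in servers_needed_per_month)


# Two scenarios over 24 months, for a plan that anticipated 10 servers.
success = [10] * 24   # demand arrives as forecast
flop = [1] * 24       # demand never materialises

print("CAPEX, either scenario:", capex_spend(10))
print("OPEX, success:", opex_spend(success))
print("OPEX, flop:", opex_spend(flop))
```

With these assumed numbers, on-demand costs more than buying hardware if the forecast comes true, but far less if it doesn't. That is the sense in which the spend becomes different and less risky rather than necessarily cheaper.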
Is That All There Is to Cloud Native?
At its heart, however, a Cloud Native strategy is not fundamentally about speed, scale or margin. These are all merely useful side-benefits. CN is about building an organisation that can change more quickly and nimbly by removing the technical risks associated with that change. In the past, our standard approach to avoiding danger was to move slowly and carefully. The Cloud Native approach is about moving quickly by taking small, reversible and low-risk steps.
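One way to picture "small, reversible and low-risk steps" is a gradual rollout, where a change is shown to a growing share of users and withdrawn as soon as a health check fails. The sketch below is a minimal illustration under that assumption, not a description of any particular tool; `gradual_rollout` and `is_healthy` are hypothetical names.

```python
from typing import Callable, Sequence


def gradual_rollout(
    steps: Sequence[int],
    is_healthy: Callable[[int], bool],
) -> int:
    """Expose a change to a growing percentage of users, one step at a time.

    Returns the percentage of users left on the new version: the final
    step if every check passed, or 0 if a check failed and we rolled back.
    """
    for percent in steps:
        # Each step is small, so a failure here affects few users.
        if not is_healthy(percent):
            return 0  # reversible: send everyone back to the old version
    return steps[-1] if steps else 0


# A change that stays healthy reaches everyone.
print(gradual_rollout([1, 5, 25, 100], lambda percent: True))

# A change that breaks at 25% of traffic is rolled back before reaching 100%.
print(gradual_rollout([1, 5, 25, 100], lambda percent: percent < 25))
```

The point of the design is that the cost of being wrong is bounded by the size of the current step, which is what makes it safe to move quickly.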
For an online business, this is revolutionary. Companies can potentially vary their products at the same speed as they can update website content using their content management system. It’s extremely powerful -- but it isn’t free, and it isn’t easy. Going Cloud Native is a huge cultural shift as well as a technical challenge.
How Does Cloud Native Work?
The fundamentals of Cloud Native have been described as container packaging, dynamic management and a Microservices-oriented architecture.
We believe Cloud Native is actually about adopting five architectural principles, which is hard, plus two cultural ones -- which is even more difficult.
1 - Use servers-as-a-service
Use infrastructure- or platform-as-a-service: run on compute resources that can be flexibly provisioned on demand, like those provided by AWS, Google Cloud, or Microsoft Azure. Flexible, on-demand infrastructure and managed operational services are together known as a Cloud.
2 - Adopt extreme automation
Replace manual tasks with scripts or code. For example, use automated test suites, CI/CD, automated deployments and automated monitoring.
3 - Adopt decoupled architecture
Design systems using, or evolve them towards, a microservices architecture where individual components are small and decoupled.
4 - Encapsulate processes
Package processes together with their dependencies, making them easy to test, move and deploy. This often means containers, but it can also mean VMs or serverless functions.
5 - Use automated orchestration
Abstract away individual servers in production and automate management of runtime environments using off-the-shelf dynamic management and orchestration tools such as Kubernetes.
6 - Delegate!
Give individuals the tools, training and discretion they need to safely make changes, deploy and monitor them as autonomously as possible (i.e. without a slow management approval process).
7 - Adopt dynamic strategy
Communicate strategy to the individuals above, but expect failures and lessons learned to modify that strategy. That is the ultimate purpose of the fast, experimental deployment that CN provides. There’s no point running experiments if you don’t then act on the results.
These steps have many benefits, but ultimately they are about reducing technical risk-taking (a bad thing) and encouraging more creative risk-taking (a good thing).
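As an illustration of principle 5 above, orchestration tools such as Kubernetes are built around a declarative idea: you state what should be running, and the tool works out how to get there. The toy sketch below shows only that idea. It is not how Kubernetes is implemented, and the `reconcile` function and the service names are hypothetical.

```python
def reconcile(desired: dict[str, int], running: dict[str, int]) -> list[str]:
    """Return the actions needed to move `running` towards `desired`.

    Both arguments map a service name to a replica count.
    """
    actions = []
    for service in sorted(desired.keys() | running.keys()):
        want = desired.get(service, 0)
        have = running.get(service, 0)
        if want > have:
            actions.append(f"start {want - have} x {service}")
        elif want < have:
            actions.append(f"stop {have - want} x {service}")
    return actions


desired = {"web": 3, "queue-worker": 2}
running = {"web": 1, "queue-worker": 2, "old-batch": 1}
print(reconcile(desired, running))
# ['stop 1 x old-batch', 'start 2 x web']
```

Because operators describe the target state rather than the individual steps, nobody needs to know or care which server a given process lands on, which is what "abstracting away individual servers" amounts to in practice.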
What Keeps You Awake At Night Now?
Over a decade ago, in a small enterprise, I lay awake at night wondering what was actually running on the production servers, whether we could reproduce them and how reliant we were on individuals and their ability to cross a busy street. Then I’d worry about whether we’d bought enough hardware for the current big project, or whether our single data center was going to collapse. We saw these as our most unrecoverable risks. Finally, I worried about new deployments breaking the existing complex, monolithic services. That didn’t leave much time for imaginative ideas about the future. In fact, people having imaginative ideas was another thing to worry about.
In that pre-cloud world, we had no choice but to move slowly, spending lots of time on planning, testing and documentation before we made our purchasing decisions. That was absolutely the right thing to do then to control technical risk. However, the question now is “is moving slowly our only option?” In fact, is it even the safest option any more? We’d argue that slow, methodical progress is not only not required, it’s existentially dangerous because it doesn’t create a good environment for learning.
We’re not considering the Cloud Native approach because it’s fashionable – although it is. We have a pragmatic motivation: the approach appears to work well with continuous delivery, provide faster time to value, scale well and be efficient to operate. However, most importantly it seems to help reduce technical risk in a new way – by going fast but small.
However, that’s still not the win.
At the start, we asked if Cloud Native was a moving target. The answer is that the Cloud is indeed a moving target, and so is consumer demand. Cloud Native is just a technical approach to handle that constant change. The win of CN is reducing technical risk so you can afford to change - to take the creative risks and experiments that keep a business competitive in an unstable world. If you have the culture to do it.