Cloud Native Blog - Container Solutions

Cloud Native Maturity Matrix

Written by Pini Reznik | Jun 5, 2018 10:32:38 AM

When an enterprise organisation looks to transform itself into a Cloud Native entity, the transformation must be firmly rooted in understanding.

Too many companies see going to Cloud Native simply as going to Kubernetes. Blinders on, they see only the exciting new tech that everyone is buzzing about. But without understanding their own architecture, their own maintenance and delivery processes -- or, most crucially, their own internal culture and its pivotal role in the transformation process -- what they are rushing toward is instead an expensive waste of time and resources.

The Container Solutions approach begins the same way for any client: understanding their specific needs. While helping dozens of world-class companies like Adidas and Maersk through successful cloud migrations, CS has discerned the elements essential to an effective transformation process. Using this understanding, we created the Container Solutions Cloud Migration Maturity Matrix, a unique tool to understand where your company is now -- so we can get you to where you want to be.

Thus, the very first thing we do is take a client through the CS Maturity Matrix. This creates an accurate snapshot of an enterprise along nine different axes. We use it to define, analyse and describe organisational status and then validate the migration process. Constantly re-assessed as things progress, the data allows us to customise transformation goals and monitor progress while working to keep all points aligned.


How do we assess alignment? Drawing a literal line through each stage’s current status point, from culture to infrastructure, gives instant and invaluable feedback. The goal is to have the line as flat as possible, moving the migration forward with a consistent front line. Graphing status in this way -- in the sample Maturity Matrix above, for example, Culture has progressed somewhat past Waterfall, while Process has nearly reached Agile -- provides a powerful visual of a company’s state. We then use this visual to analyse, via the theory of constraints (TOC) paradigm, whether any axes are acting as potential bottlenecks for the entire system.

TOC embodies the understanding that a chain is no stronger than its weakest link; thus, any system can be limited in attaining its goals by a mere few constraints. Even a single weak point can slow down, damage, or even break an entire process. The Maturity Matrix allows us to identify the worst bottleneck and immediately begin to increase the flow there, which in turn immediately increases the throughput of the entire system. (Conversely, it also prevents wasting time and resources on improving a non-bottleneck axis whose enhancement simply will not open up throughput, at least at that point in the transformation process. For example, speeding up the delivery axis while not addressing other areas is not a good idea.)
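The weakest-link reasoning can be made concrete with a small sketch. It assumes a deliberately simple serial model in which the system's throughput equals the capacity of its slowest stage; the stage names and numbers below are hypothetical and only illustrate the principle.

```python
# Hypothetical weekly capacities for four stages of one delivery pipeline.
# The stage names and numbers are illustrative assumptions, not client data.
capacities = {"design": 12, "development": 9, "delivery": 20, "maintenance": 4}


def bottleneck(capacities):
    """Return the stage with the lowest capacity (the weakest link)."""
    return min(capacities, key=capacities.get)


def system_throughput(capacities):
    """Work flows through every stage, so the slowest stage caps the system."""
    return min(capacities.values())


def improve(capacities, stage, extra):
    """Return a copy of the capacities with one stage increased by `extra`."""
    updated = dict(capacities)
    updated[stage] += extra
    return updated


print(bottleneck(capacities))                                    # maintenance
print(system_throughput(improve(capacities, "delivery", 10)))    # 4
print(system_throughput(improve(capacities, "maintenance", 3)))  # 7
```

Speeding up delivery, which is not the bottleneck, leaves throughput at 4, while a much smaller improvement to maintenance lifts the whole system to 7.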

This does not mean, however, that all other teams working on other axes must remain on hold while a bottleneck is addressed. Aligning the Maturity Matrix does not mean moving in lockstep. It’s OK for different teams to progress at different rates, especially if some of these teams are preparing the ground for an easier transition. Transitions typically progress at a gradual rate, and teams that join the process at a later point can in fact learn from the teams that were able to start experimenting earlier.

More important is that these transitional actions happen in the correct order, as addressed in our previous article outlining the six steps of a successful cloud migration.

Also integral to this process is Conway’s Law, a concept at the heart of DevOps which essentially states that system architecture will come to resemble the structure of the containing enterprise organisation. Thus we also consider alignment context with Conway’s Law in mind, looking to have the goal for each individual axis align with the overall enterprise goals.

Staying in Sync

So, ultimately, ‘alignment’ does not mean maintaining an even line during the transition. Perhaps the better term would be ‘staying in sync’: making sure that each of the axes is adequately and appropriately addressed, and that the transition is comprehensive. The objective is to address the entire complex system holistically and in context.

In fact, however, it is very unusual for a company to have that even, vertical line. Going in, most have more of a mountain range than a flat plane. This happens because people tend to work harder on the areas that are already fairly advanced, a result of what is known as the ‘streetlight effect’: the phenomenon where people only search for something in the places where it is easiest to look. It translates to teams and alignment because organisations typically have more motivated people working in the areas that are the most advanced. In other words, people try to improve the things that are already working best in their organisation.

To get things in sync, Container Solutions uses the Maturity Matrix to identify the areas that are furthest behind and prioritise them. Working to move those forward into alignment does in fact move the entire migration forward.
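As a rough illustration of that prioritisation, the sketch below scores each of the nine axes on a hypothetical scale from 0 (leftmost stage) to 4 (rightmost stage), then reports how uneven the line is and which axes lag the most. The scale and every score are assumptions made up for the example; the actual assessment is not a numeric formula.

```python
# Hypothetical positions for the nine axes on an assumed 0-to-4 scale,
# where 0 is the leftmost stage of the matrix and 4 is the rightmost.
scores = {
    "Culture": 1.2,
    "Product/Service design": 1.5,
    "Team": 2.5,
    "Process": 1.9,
    "Architecture": 1.0,
    "Maintenance": 2.0,
    "Delivery": 2.2,
    "Provisioning": 3.0,
    "Infrastructure": 2.8,
}


def spread(scores):
    """Gap between the most and least advanced axis; 0 means a straight line."""
    return max(scores.values()) - min(scores.values())


def furthest_behind(scores, count=2):
    """Return the `count` axes with the lowest scores, lowest first."""
    return sorted(scores, key=scores.get)[:count]


print(spread(scores))           # 2.0
print(furthest_behind(scores))  # ['Architecture', 'Culture']
```

In this made-up snapshot, Architecture and Culture would be the first candidates for attention, even though Provisioning and Infrastructure are where progress has come most easily.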

The Maturity Matrix: Nine Axes of Understanding

In summary, each of these axes matters, both individually and as a member of an integral system. Let’s take a look at the nine individual axes of the Maturity Matrix across their possible stages.

Culture

Culture is the most common “out of sync” axis. In fact, the culture axis is the most difficult one to progress on the whole Maturity Matrix. Culture is abstract, hard to transform, and it is a slow process. The other axes are faster and more doable because, ultimately, they are just code and planning. Changing culture also requires a lot of buy-in across the entire organisation, while the other axes can function in a fairly independent way.

Cloud Native software is distributed by definition, and requires a distributed and decentralised culture and management. Hierarchical command and control is a good fit for building monoliths, but microservices require a different approach. If your board is still used to dictating ‘This feature in January and this in March and this in August’ and expects to see exactly those features on that schedule -- they are going to be upset. They simply won’t understand why you’re doing what you’re doing when you undertake a more reactive approach. The opposite is also possible, where the board may want to iterate rapidly but your tech team is working within a traditional, linear, Waterfall model.

The important point is consistency. There needs to be a coherent culture throughout the organisation or there will be conflict and frustration.

Product/Service design

This is where we look at what you do, and how you go about doing it. Is your process ‘Do X in January, Y in May, and Z in September’ with no deviation? Then you’re Feature-driven. Or are you Functional or even Market-driven, giving customers Z in February even though it wasn’t planned until September? Market-driven is where you need to be for Cloud Native.

Or maybe you have no process at all, which is Arbitrary; your CEO just says “DO THIS” and that is what gets done. Even big businesses can have this, and it’s especially common with founder-driven organisations.

No matter where your current product or service design stage happens to be, we will be designing with the ‘next’ in mind: putting in place the foundation to eventually adopt strategic and data-driven processes, and building in readiness for next-wave technologies before they even fully emerge.

Team

The old way was top-down: do what the boss says. This is Hierarchical organisation, or even no organisation at all. Not only is this very slow, but it is also difficult to return feedback up the chain. When all decisions must be made or approved according to the hierarchy, the managers become the bottleneck -- an unnecessary block to progress. One reason is that in a hierarchy, managers are overloaded with work. The true damage, however, is how this ultimately limits the creativity of the team: managers typically are not ready to approve something they cannot envision themselves.

Now we mostly see Cross-functional teams, where you have a designer, a UI expert and someone with server experience all yoked together. This evolves forward into DevOps/SRE, where everybody can do a bit of everything -- say, your designer can also provision a machine. Cloud Native gives you that ability, which in a perfect world is key to making things go so much faster.

Having this kind of decentralised and self-sufficient team structure makes sense. If we are always improving our systems in response to changing strategies and customer/market demand, shouldn’t we have the same mindset for teams? Realistically, however, it is difficult to expect every member of the team to understand all the aspects of infrastructure, programming, design and other disciplines. Specialisation is an essential part of economy of scale.

The best way to approach this is to insert the right level of abstraction between the teams and to create APIs for fast and effective collaboration. For example, when a team is using a public cloud, they don't need to fully understand the internals of the cloud -- only the API, which is fully automated. It is very important, though, for developers to acquire the right knowledge and use the right tools to build distributed software, as most of the software that is built today is distributed.

For us this typically means that the separation between the dev and ops teams happens at the level of Kubernetes interfaces. Developers need to know how to deploy and maintain distributed applications on Kubernetes. Ops need to know how to provision and maintain Kubernetes itself, and also everything else below and around it that creates consistent Cloud Native infrastructure.

The future looks toward what Skyscanner’s CTO calls ‘radical autonomy’ -- bottom-up and delegation focused. But when giving a lot of autonomy, you also need to have lots of monitoring and firewalling. This makes things not as radical as they would seem; you are still giving everyone full autonomy over their own process, but isolating each from what everyone else is doing, which reduces the blast radius if they screw up.

A common conflict to anticipate on the Team axis is freedom of choice vs. standardisation. Teams naturally want to choose the best tools for the job, but in enterprise-level microservices this may lead to chaos: too many tools doing the same thing will be impossible to maintain effectively. The flip side, however, is that fully controlled standardisation discourages people from taking initiative and exploring innovation.

A problem we see related to this axis happens when Team is way off to the right, but Architecture is way off to the left on a tightly coupled monolith. That is when we work to catch up Architecture as fast as we can.

Process

This is where we look at whether your enterprise does planning up front, and then execution. Or do you change things responsively and on the fly? Right now almost everybody uses Scrum/Kanban as the go-to process. Scrum is like a waterfall, but with iterations shortened to 1-2 weeks instead of 6-12 months. This is excellent progress, but still a bit too slow for Cloud Native.

Cloud Native and CI/CD require the next jump in speed: Now developers need to be able to deliver every day -- and do so independently from other developers.

Also, Scrum requires coordinated development and delivery since it was originally built for Agile teams guiding monolithic products. The intent was to allow delivery of value in smaller chunks while protecting the teams from unwanted interruptions. In the Cloud Native world, though, the coordination between team members is far more flexible and responsive, and the speed of change is higher.

The Cloud Native process itself is still being fleshed out, because it’s still so emergent. This is not yet a beaten path so much as one that’s being actively created by the people walking it. What we do know, however, is that it’s critical to make sure that the foundation is right and the system architecture can support future changes. It is very important, especially in the realm of microservices, to have actual architecture and development guidelines. Otherwise, chaos will become unavoidable.

Architecture

This is where we take a look at whether your enterprise is trying a ‘batteries included’ approach to provide everything needed for most use cases -- the Tightly Coupled Monolith. Or perhaps it is at the next step in the evolutionary architecture chain, Client-Server. The Cloud Native goal is of course to be Microservices driven.

Very commonly, people start with a single-server, single-service box, then split it, and split it again, into smaller and smaller functions. This is usually beneficial because it reduces blast radius while increasing scalability. However, it does take time to maintain and time to monitor, and deployment can potentially take longer. Key here is automation; unless you automate everything, it will swamp you. In rare cases we do find that a company comes to us with a very good level of automation already in place, but usually the level of automation is significantly lower.

Maintenance

Here we assess how you monitor these systems. Ad hoc means going in every now and then to see whether a server is up and what the response time is. The somewhat embarrassing fact is that a lot of folks still do that. But it is nowhere near fast enough for this new world, nor really is Alerting.

Comprehensive monitoring is an absolute necessity for Cloud Native. As things become more complex, you need cleverer monitoring that is more reactive and that looks after itself -- that filters out alerts you don’t need to know about and alerts you to the ones you really do. Looking forward to the Next stage, we come to tools that allow the system to even start to become self-healing. These tools are already here, and solid, but at this time they are not at all easy to use.
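To give a flavour of what filtering out the alerts you don’t need can mean in practice, here is a minimal sketch. It is not how any particular monitoring product works: the `Alert` type, the severity scale and the rule (collapse repeats of the same service and check, then drop anything below a threshold) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    service: str
    check: str
    severity: int  # assumed scale: 1 = info, 2 = warning, 3 = critical


def filter_alerts(alerts, min_severity=2):
    """Collapse repeated alerts per (service, check), keeping the most severe,
    then drop everything below `min_severity`."""
    worst = {}
    for alert in alerts:
        key = (alert.service, alert.check)
        if key not in worst or alert.severity > worst[key].severity:
            worst[key] = alert
    return [alert for alert in worst.values() if alert.severity >= min_severity]


# Hypothetical stream of raw alerts from three services.
incoming = [
    Alert("checkout", "latency", 1),
    Alert("checkout", "latency", 3),
    Alert("search", "disk", 1),
    Alert("checkout", "latency", 3),
    Alert("auth", "errors", 2),
]

for alert in filter_alerts(incoming):
    print(alert.service, alert.check, alert.severity)
# checkout latency 3
# auth errors 2
```

Five raw alerts become two that someone actually needs to see. Real systems add time windows, routing and escalation on top of this kind of rule.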

Delivery

Delivery is really all about how quickly you can get things out, and in how automated a fashion. The Maturity Matrix moves from traditional Major Version releases to Monthly to Continuous Integration and finally to Continuous Delivery.

This axis exemplifies the importance of performing transition steps in the proper sequence: with delivery, the order of introduction for CI/CD with respect to other axes is particularly vital. First, it’s important to move to CI/CD as early as possible in a cloud migration, as this gives a very good base for quick introduction of changes and for experimentation. However, introducing CI/CD prior to microservices will dramatically slow the transition down. Furthermore, it’s important to have CI/CD capabilities in place prior to the creation of the first microservices: otherwise, delivery and maintenance of microservices become unmanageable. Worse, in many cases it will also lead to creating an inherently unhealthy system.

The goal is a combination of speed and automation in getting your code into production: from every six months to every month, every week, every day, and on to every 10 minutes as you move from left to right across the Maturity Matrix…

Provisioning

This axis is all about how you control your infrastructure: how you create new infrastructure and new machines, how quickly you can deploy everything, and how automated it is. As you move to the right, your company does want to be leading here. (That said, Kubernetes seems perfect -- but if this axis gets ahead of Delivery or Maintenance then you’re screwed.)

Infrastructure

Everyone knows this one: Single server to Multiple servers to VMs running in your own data center. Then shifting to Hybrid cloud for a computing environment that mixes on-premises infrastructure with private and/or public cloud services best tailored to your company’s specific needs and use case.

A problem scenario to be anticipated and avoided here is VM configuration. VM setup needs to be fully automatic or it will lead to constant fighting between the infrastructure and the development teams. The result will be either massive over-provisioning -- which is completely counter to Cloud Native principles -- or shadow IT, where the internal infra team ends up sidelined: the dev teams will typically move to a public cloud using a dev department credit card, without any coordination with the infra team.

In larger companies in particular we constantly discover the problematic state where multiple teams have taken their work to public clouds, even different ones. While they may do this with the best of intentions, they each do it independently, without coordination -- and without considering things like security, compliance and other operational concerns. Going down this road is a disaster for an enterprise.

The more effort invested into those individual systems, the more difficult it becomes to refactor them into a consistent, enterprise-wide infrastructure environment.

The Axes, United

When an enterprise organisation looks to transform itself into a Cloud Native entity, the transformation must be firmly rooted in understanding. If something can be measured, it can be understood. Then once understood, it can be managed. The novelty of continuous improvement has already worn off -- it’s no longer a competitive advantage but a requirement for survival.

The Container Solutions Maturity Matrix is calibrated not only to optimise enterprise transition to current Cloud Native technology. It is also carefully designed to help clients intelligently set the stage for whatever comes next.