As you complete each project as a consultant, you start noticing patterns. No matter what the differences are between the various architectures, tech stacks and teams, you find yourself following the same steps.
This is especially true when we start talking about concepts like cloud migrations or Cloud Native transformations. Often, I will sit with a large sales team selling clients a magical silver bullet, or some optimised methodology that will no doubt have them in the cloud in no time. Don’t get me wrong, there is at least some truth in a lot of these methodologies and systems that have been built to help companies become more Cloud Native.
There are also, however, much simpler solutions. This is a small playbook on steps I have followed countless times to get companies more Agile in the cloud.
One of the most underrated yet effective steps is simplicity. It might come as a shock, but often the most scalable and effective systems are also the simplest. When moving to the cloud you will often hear the phrase ‘lift and shift’. It is sold as the way to leverage ‘quick’ wins in the cloud.
From an engineering point of view, all I see is that companies that jump on this approach take their legacy problems and label them as Cloud Native. All those workarounds and quick fixes that you did for that server running in the basement are now running on a shiny EC2 instance. Yes, you will get into the cloud, but you will have no gains, or at best very limited ones, from this move.
So what do I mean by simplicity? Well, quite simply (excuse the pun), it means having a hard look at your system, looking for the smallest, least complex components and defining a roadmap on how to leverage the cloud for them.
There are a couple of upsides to taking this approach. You will learn important lessons working on something simple and, as you start migrating the more complex components, you will have a better understanding of what works and what the pitfalls are.
In addition, showing value quickly is important, as often a company will be on the fence about cloud, and you will have limited budget and/or limited input. If you can show value upfront, the interest in and viability of the project grows within the company. You will show value faster, as moving a simple piece that is useful will take vastly less time than moving a large, complex piece.
Building on this idea of simplicity, I want to introduce a term that is common in a lot of transformations and migrations: ‘low-hanging fruit’. This is an obvious idea, but one that is hardly ever utilised.
In a complex system, there are always components that are more Cloud Native than others. Have a look at your containerised apps, those async workers where latency is not a problem, or any less production-critical workloads. You can use these parts of an application to pave the way for more complex components.
We often go in with the idea of no failure. This is not the point; you will encounter some failures in any migration. The important thing is to fail early and learn, rather than fail late when there is less room to maneuver.
This is easier to do with components that are less sensitive or less critical. In terms of identifying these components, I’ve found it often helps to diagram your current system and break down the components on a chart that references the amount of effort required, and the impact of moving that specific component.
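As a rough illustration of that effort-versus-impact chart, here is a minimal Python sketch. Everything in it is an assumption made for the example: the component names, the 1 to 5 scoring scale, the threshold and the quadrant labels are all hypothetical, and you would replace them with whatever your own diagramming exercise produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    effort: int  # hypothetical scale: 1 (trivial to move) to 5 (very hard)
    impact: int  # hypothetical scale: 1 (negligible value) to 5 (high value)


def quadrant(component: Component, threshold: int = 3) -> str:
    """Place a component in one of four illustrative effort/impact quadrants."""
    low_effort = component.effort < threshold
    high_impact = component.impact >= threshold
    if low_effort and high_impact:
        return "quick win"
    if high_impact:
        return "major project"
    if low_effort:
        return "fill-in"
    return "reconsider"


def migration_order(components: list[Component]) -> list[Component]:
    """Sort so that cheap, valuable components come first.

    The key (effort minus impact, then effort) is just one simple heuristic.
    """
    return sorted(components, key=lambda c: (c.effort - c.impact, c.effort))


# Made-up components, scored by hand for the sake of the example.
components = [
    Component("billing-monolith", effort=5, impact=5),
    Component("async-worker", effort=1, impact=3),
    Component("legacy-ftp-sync", effort=4, impact=1),
    Component("reporting-api", effort=2, impact=4),
]

for c in migration_order(components):
    print(f"{c.name}: {quadrant(c)}")
```

The scores themselves matter less than the conversation they force: agreeing on them as a team is usually where the low-hanging fruit becomes obvious.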
Technology is an industry full of opinions and ideas, so there are a lot of words that you will hear often: Kubernetes, microservices, containers and CI/CD, to name a few. It's often difficult to keep up.
The suggestion here is simple: Don’t.
No single technology is a silver bullet. No matter what the sales guy tells you, you have a better understanding of your needs than anybody else does. So when looking at your Cloud Native transformation, start with your current tech stack. Have a look at the needs that have prompted you to explore this option (e.g. we need to scale, we need to deploy faster, we need better insight). See if any of the suggestions fulfill those needs. Work from a simple principle: if it does not add value, do not add it.
Value in starting with the cloud is generally going to be dictated by the business. Your organisation’s leaders may not be sure of the merits of a cloud migration in general, so they will need convincing. This, in turn, means that your current bar is the system you already have, along with orthogonal concerns such as the cost and speed of deployment.
If you are offering the business anything less than what your organisation already has, then the move to the cloud will seem like a backwards step to your leaders. The way to address this is to add as little change as possible (there will still need to be some change) while utilising the power of the cloud.
Once your system is in the cloud, you are free to play Buzzword Bingo to your heart's content. However, I would suggest that you stick to one principle: what's not working needs to go.
This cannot be stressed enough. Working for a consultancy, more often than not you are put into teams that have already tried cloud migrations and failed. In the worst-case scenario, the people that created these migrations are no longer with the company, but the company has sunk X amount of cost into this endeavour and is not willing to forfeit that expense.
Now, the honest truth is that these endeavours have generally failed because they were not structured in the right way for the company. Oftentimes, these are big-bang approaches, or the requirements for using the new system are too strict, which tends to result in a half-migrated system.
The best advice here is cut your losses; nobody wants to own a half-built ship. That’s not to say that the endeavours were completely useless. Take the lessons learned from the systems that did not work and architect the new system with those lessons in mind.
Iteration is key. When you meet resistance in your new architecture and something is not working, take a step back, pivot, and try a new approach. This does not mean that you should redesign the whole thing every time. Try to keep things modular, so that each step back means redesigning only a single part.
Last but definitely not least, I would suggest getting defined standards in place as early as you can. There are two levels of standards. Within the company there should be internal standards, e.g. naming conventions, tech stack, etc. However, the more important standards are the community standards.
When looking at the technology industry you tend to notice trends in specific methodologies, tech stacks, operational models, and so on. There is a reason for this—put bluntly, it’s a numbers game. When you get a large number of the smartest engineers in the world working together, you tend to get the most optimised solutions.
This is what the open source community offers. Learn from them, follow their practices, and even if every bone in your body says you can do it better, do not reinvent the wheel.
Why? Well, put simply, it’s easier to scale an engineering team this way. If you follow community practices and standards, then you can almost expect the engineers you hire to be aware of them and even have experience in them.
The alternative to this is that you have a completely custom setup that each new engineer needs to learn and become experienced in. This also doesn’t really help their careers and this may, in turn, make future hiring more difficult.
These steps have helped in countless cloud migrations and Cloud Native transformations and, while they seem obvious, they are surprisingly hard to implement.
The last bit of advice is this: There is no such thing as a silver bullet. Even this list will not hold up in every circumstance. Take what you need and leave out the rest.