Every once in a while, a global development initiates a chain reaction of disruptive events across many industries and markets. Such disruptions can be caused by people, organisations, the market, luck, or simply nature. In most cases, we cannot anticipate when and how they will occur, but history has taught us that it is just a matter of time. The world is obviously experiencing such an event now: the COVID-19 pandemic.
Bob Dylan wrote ‘The Times They Are a-Changin'’, suggesting that one needs to make peace with the fact that things might never be the same again. This feels like a great analogy for what both the world of business and the world of Cloud Native are currently experiencing.
Before the pandemic, the perception of Cloud Native revolved around the idea of modernising the infrastructure in pursuit of greater efficiency, faster time to market, and cost optimisation. The first wave of the pandemic greatly increased the urgency of adopting Cloud Native, as companies suddenly faced an existential threat.
Many businesses faced scalability and availability issues, as the shift to digital services and e-commerce platforms caused a constantly increasing demand for remote-friendly services. This new reality forced the industry to adapt as fast as possible, which led to the realisation that not only will the world not return to the old ways after the pandemic, but we all need to build future-proof and resilient organisations.
This blog post will dive into the reasons why going Cloud Native is the way forward to cope with such global disruptions and, more importantly, how such developments provide a great opportunity to build an organisation that is resilient enough to withstand the test of time.
Cloud Native to the Rescue
Let’s talk about today first. The new reality that emerged when the pandemic hit required many industries to cope with issues related to scalability, availability, and delivery of new features or systems. Adopting Cloud Native strategy, processes, and technologies addresses exactly those pain points, which explains why organisations that had already embraced Cloud Native were better positioned in the market.
One example is the engineering team for remote learning platform Khan Academy, whose leaders have described how they scaled to handle 2.5x their normal traffic in a week, with usage reaching 30 million people in April. Marta Kosarchyn, CTO and VP of engineering at Khan Academy, says that two key components of Khan Academy’s infrastructure, Google App Engine and Fastly CDN, were instrumental in ensuring that its platform could scale in response to rising demand.
Another example is Texas-based retailer H-E-B. In a fascinating talk at Chaos Conf 2020 in October, Justin Turner, a senior software engineering manager in Digital Fulfillment at H-E-B, described how the retailer rapidly moved beyond a proof of concept for curbside and home delivery of groceries following a successful Cloud Native migration.
A Matter of Scale
From an IT perspective, a critical problem that many organisations faced when the pandemic hit is the difficulty of scaling IT infrastructure in a confident and timely manner. A simple definition of the term ‘scalability’ is the ability to increase or decrease IT resources to meet changing demand.
During the pandemic, demand increased exponentially and chaotically, which caused uncertainty regarding the real scaling capabilities of IT systems. Traditional companies purchased more physical servers and tried to solve the problem by scaling their data centres horizontally. This might provide temporary relief, but it also creates far higher operating costs and leaves a large amount of resources unused when traffic levels return to normal.
Even when Cloud Native technologies, like containers and Kubernetes, are used under the hood, horizontal scaling only becomes a real possibility when the architecture and services themselves support it. In the Cloud Native world, there is a popular saying: ‘the cloud is just someone else’s computer’. This implies that the ability to scale goes beyond the kind of underlying infrastructure and involves adopting the right processes and architecture patterns.
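To make ‘increase or decrease IT resources to meet changing demand’ concrete, here is a minimal sketch of the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents for choosing a replica count. The function name, parameter names, and the load figures are illustrative assumptions and do not come from any of the companies mentioned in this post.

```python
import math


def desired_replicas(current_replicas: int, current_load: float, target_load: float) -> int:
    """Return how many replicas are needed to bring per-replica load back to target.

    Implements ceil(current_replicas * current_load / target_load), the
    proportional rule described in the Kubernetes Horizontal Pod Autoscaler
    documentation. The result is never lower than 1 replica.
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    return max(1, math.ceil(current_replicas * current_load / target_load))


# Hypothetical numbers: 4 replicas running at 90% CPU against a 60% target.
scale_up = desired_replicas(4, 90.0, 60.0)

# Hypothetical numbers: traffic drops and 6 replicas idle at 20% CPU.
scale_down = desired_replicas(6, 20.0, 60.0)
```

The same rule scales in both directions, which is the point made above about physical servers: capacity bought for the peak cannot be handed back when traffic returns to normal, whereas a replica count can simply be recomputed. The calculation only helps, of course, if the services themselves can run as multiple independent replicas.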
A great example of this is Netflix. The domino effect of country lockdowns that governments imposed in March and April 2020 created a considerable surge in new users and in demand for streaming services. Netflix itself reported the stunning addition of 15.77 million paid subscribers, more than double its pre-pandemic forecast.
The company is practically the poster child for Cloud Native infrastructure and it coped with the new reality without major problems. Furthermore, Netflix used the situation to its own advantage to boost its business as its stock price skyrocketed.
Another argument that is frequently used to highlight the advantage of adopting Cloud Native is cost optimisation. After the pandemic hit, many markets experienced a sudden decrease in their revenue. This effect was particularly dramatic for businesses that rely on physical services or infrastructure. Cost optimisation became a critical KPI that in many cases determined the survival of the business itself.
Using Cloud Native technologies, an organisation can make efficient use of its IT resources, which typically translates into lower operational cost and eventually more revenue. Optimising the operating expenditure (or OpEx) is facilitated not only by directly using scalable infrastructure to cope with dynamic demand, but also by adopting the required mindset and processes that eventually lead to less downtime, faster recovery, and fewer incidents.
Optimising costs is tightly coupled with leveraging scalability, as mentioned in the previous section, and with implementing highly available and multi-region architectural patterns. This is an area where Cloud Native particularly shines, but it is definitely not an easy feat. It involves not only a shift in the tools and technologies used but also the mindset change required to simplify and optimise the existing processes.
A metric that became even more key for many businesses during the pandemic is time-to-market. COVID-19 has not only increased the use of digital services but, even more importantly, has created new requirements and use cases.
As in our H-E-B example from earlier, a grocery store may need to start supporting home deliveries of its products, which means providing options to users for flexible time slots, frequently updated lists of available products, route optimisation for the delivery vans, etc.
These new features need to go from conception to production in a matter of days without jeopardising their quality, as many of these services are directly linked to customers’ quality of life during these difficult times. Some organisations, such as Netflix, are used to moving at such speeds. However, this is not normal for many other industries.
Another aspect of increasing agility is reducing waste, which is typically addressed by following Lean principles to optimise the existing processes. Cloud Native extends beyond the use of particular technologies and involves the optimisation of processes like Continuous Integration and Continuous Delivery while trying to achieve higher degrees of automation. Adopting such practices can radically improve time-to-market and provides the organisation with the necessary agility and flexibility to respond to change.
Playing the Infinite Game
In The Infinite Game, Simon Sinek elaborates on how successful businesses have managed to withstand the test of time and create long-lasting brands. What I found interesting is the realisation that these organisations do not play with a finite mindset, meaning that they are not concerned only with what is in front of them.
They are working towards creating a unique identity that is based on resilience and continuous improvement. They are neither obsessed with being the first among their competitors every single day, nor with having the best technology or tools. They are focused on becoming a better version of themselves and on creating products or services that are user-centric and satisfy a particular need.
Apple is a great example of this. Back in 1997, Steve Jobs famously said, ‘You’ve got to start with the customer experience and work back toward the technology—not the other way around’. This shift in mindset requires brave and decisive action for change. Having this mindset is the first part. The second part is using the right strategy, tools and processes—and this is where Cloud Native comes into play.
When we talk about resilience in a software engineering context, we usually refer to the ability of the system to recover from a fault and withstand stress whilst performing its core functions. We want to make sure that our applications and systems are resilient enough to withstand heavy traffic, unexpected errors, outages, and so on.
However, system resilience is hard to achieve in today’s complex IT environments. Organisations end up increasing the complexity of their systems through a never-ending cycle of adding more functionality on top of an already complicated system, as the pressure to deliver fast leaves little room to take a step back and think of more resilient and fault-tolerant architectures.
What an infinite mindset tells us, though, is that business is a marathon, not a sprint. And to endure a marathon, you need more than system resilience.
I like to believe that an organisation is first and foremost its people. Building resilience across the company means that people become confident in managing their systems and, more importantly, they stay calm in the face of the unknown.
Going through such a transformation process, trying to make efficient use of the cloud, will reveal several weaknesses and possibly create technical difficulties. In the end, however, people will feel confident not only about the performance of their applications but also about their own level of knowledge and control. And more importantly, they will have built resilience through this process.
Turning Disaster into Opportunity
The oil baron John D. Rockefeller once said, ‘I always tried to turn every disaster into an opportunity’. A finite mindset will look at the disaster and try to get through it as unscathed as possible. This is a natural human response in the face of danger and is completely understandable and sensible. History has shown us, though, that those who have the ability to turn such cases into opportunities are the ones setting themselves up for greater success.
In the Cloud Native world, this opportunity doesn’t just come from migrating applications to the cloud. It doesn’t even have to do with what is migrated or how this migration happens. The culture shift that accompanies a fully realised Cloud Native migration enables a company to leverage all the opportunities for innovation that Cloud Native technologies provide: things like Edge, 5G, and IoT, for example.
Maybe even today these terms are still considered by many to be buzzwords. But the pandemic has raised new requirements and increased the demand for new features that extend beyond conventional, well-established solutions.
Innovation has always existed. What changes are the techniques and tools that facilitate innovation. A global development, such as the spread of COVID-19, has already created numerous opportunities and exposed many gaps that are waiting to be filled. This is the time for new ideas, for radical approaches, and for investments in building not only resilient systems but also innovative ones.
This, in turn, builds a problem-solving mindset which is centred on curiosity and determination, two qualities that, when employed together, lead to finding solutions to the most difficult problems.
Sustainability Is Not Optional Anymore
This current crisis will eventually pass, but it would be a mistake to think that we should then expect to see a return to business as usual. In the same way that the pandemic creates opportunities for businesses, it also creates a once-in-a-lifetime opportunity for legislators to make radical shifts. Climate change should be top of the list and, as our Tech Ethicist Anne Currie recently argued on WTF, we should exert pressure on our cloud providers to ensure they play their part.
Many organisations already have sustainability KPIs or goals. This is a natural consequence of governments imposing more and more restrictions on more and more industries in a global, synchronised effort to slow climate change down.
On the InfoQ podcast, Anne observed that electricity represents a relatively easy win, and, in the past few years, we have seen a significant effort by many cloud providers to reduce or even entirely eliminate their carbon footprint.
The effort was initially led by Microsoft and Google, both of which are carbon neutral and aim to be carbon zero by 2030. Amazon, whilst only carbon neutral in four regions, has also announced a 2030 goal for carbon zero. They will hopefully be followed by others. Moreover, new cloud providers, such as LeafCloud, are joining the market and trying to push this vision forward. These very encouraging developments put Cloud Native at the forefront of the sustainability battle for two main reasons:
Firstly, as we discussed above, scalability and cost optimisation are some of the most widely discussed benefits of Cloud Native technologies. This makes sense, as they come down to using resources smartly and scaling on demand, which in turn leads to consuming fewer natural resources. As a developer, making as much use as you can of the resources your cloud provider offers makes both engineering and environmental sense.
Secondly, the commitment of many cloud providers to operate their data centres almost entirely on renewable energy gives an additional incentive, as the carbon footprint can be minimised even more.
There are several well-established reasons why cloud computing in general is the way to achieve growth, and they are all still valid. Through this pandemic, though, new opportunities have emerged and there will be no better time to leverage them than now.
Playing the infinite game means looking at how to stay in the game the longest without being affected by external factors. This means taking brave steps to build organisational resilience, constantly seeking to innovate, and preparing for the future by adopting sustainable solutions that will be essential in the coming decades. At the same time, the necessary availability, scalability, and cost optimisation must be achieved to ensure not just the survival of the organisation but also its growth and adaptability.
Cloud Native can be that force of change and can provide all the needed tools and processes. It is time to take this disaster and turn it into an opportunity, in order to start playing the long game. The journey will not be easy, but it is certainly worth embarking on.