
WTF Is Continuous Improvement?

When you’re offered a Covid-19 vaccine this year, which sort would you like? One that’s been through animal and human trials, received government approval, is made on a standardised production line, and is administered in two precise doses, a specified time apart?

Or would you prefer something more ‘innovative’: a vaccine whose formula is tweaked weekly, and which is made on a production line based on open-source designs? Yes, it’s being continually updated, but if there are any problems, we can always roll back to the previous version. Can’t we?

It’s an extreme example, but I’ve been thinking a lot about continuous improvement and continuous innovation recently. People sometimes think they’re two sides of the same coin. But they’re not. They’re actually different, distinct stages in the development process. Different disciplines, focused on different outcomes. 

In business, tension often exists between leadership and management, brilliantly described in this talk from Nick Caldwell. Managers need to care about predictability and stability, and their key role is to try to create reliable outcomes. Leaders are trying to spot the missing opportunity that everyone overlooked and pull you off a predictable course.

There’s a similar tension between Dev and Operations teams, a tension the DevOps and SRE movements have been trying to resolve. Ops want to keep things working, to keep things running and stable, and keep producing exactly what they’re supposed to produce, consistently. Devs want to go fast, so of course they’ll break things. 

That’s great for innovation, for developing completely new ideas. But that approach can be pretty catastrophic when getting something new into production—or, even worse, reengineering something that’s already in production and making you money.

Innovation Alone Won't Work

What on earth does this have to do with Cloud Native? People often say they want to go Cloud Native because they want to be more innovative. Maybe they’ve already run some pilots, brought in some experts, worked through some false starts, so they may really feel like they’re getting somewhere. They might even send someone to speak at a conference about the incredible journey they’ve embarked on.

But at some point, they’ll want to switch into production. Just because they want to doesn’t mean they should.

If they’ve not asked the right questions, right at the beginning of the process, this is the point when they don’t just break experimental products and processes, but they might break stuff in production. And that, potentially, means breaking things that actually work perfectly well and earn you revenue. No one wants to tell that story at a conference.

Don’t get me wrong. I like Cloud Native. But there are bigger questions here. Cloud Native can be a great enabler for innovation, but innovation alone doesn’t keep you in business. You need a more holistic view of what it is you’re doing.

Traditionally, a product—a car, washing machine, a tank, or a vaccine—took years to develop. Development from the concept stage could be a painstaking process. Once it hit the market, you couldn’t really change it. In fact, you didn’t want to because if you’d got everything right, you had years to sell the product and make money. And it also gave you lots of time to start innovating for your next product.

No Innovation in Production

This is certainly how things have worked in the pharmaceutical world. At least until Covid-19, which, on the surface of it, has upended everything we know about time scales for developing vaccines.

Over the course of this year, we’ve seen a massive innovation effort, with companies, universities and governments making multiple bets on multiple vaccine candidates. For the pharma world, the speed at which this has happened is incredible. Lots of budget has been pushed towards R&D teams, lots of ideas and approaches have been tried. Some of these have moved beyond the experimental stage and into more focused research trials, and some of those have been tested on humans.

The result is that we’ve had at least three vaccines emerge.  Some still have to go through approvals processes, testing, and authorisation in some countries, but manufacturing has already begun.

So what happens now? As the virus mutates, would you be comfortable with someone making changes to the vaccine on the fly? Or tweaking the production process, just to see what happens?

No, for each vaccine, there’ll be one formula. You’re in production, and you have clever operations people making sure the production process is highly stable and high quality. You might improve manufacturing processes, and delivery and distribution. But these are improvements, not innovation.  Any tweaks to the formula that do become necessary as a result of changes to the virus will go back through the approvals process.

Laying the Groundwork

So the question is, how do you start with speed and innovation and gradually—or suddenly—switch into quality and stability?

You need to understand it’s a funnel, or a cycle, or whatever metaphor you want to use. You have your crazy, innovative developers working on a whole bunch of crazy ideas at one end, building the minimum viable products (MVPs) that demonstrate the possibilities of whatever sorts of things you want to build.

You have people in the middle taking the most promising of these ideas, optimising them, getting some ready for production—or dropping them, if things don’t add up. Finally, some products are moved into production, and these are the ones that make you money, or are going to make you money in the future, so you can fund more researchers or developers. And that’s where it’s about constant optimisation and improvement. It’s not about innovation, not in this phase.

And when we look at the race for a Covid vaccine, it’s easy to forget it was built on existing models and foundations. Some research on coronaviruses was already in motion before the pandemic. And even as research scientists were making multiple bets on possible vaccines, pharmaceutical firms, governments, even the Bill and Melinda Gates Foundation were laying the groundwork for testing and manufacturing. Yes, the regulatory process was accelerated, but it’s still the same process all vaccines need to clear. It’s because these foundations were already in place, and understood, that millions of people will be getting vaccinated this year. 

Building for Improvement

Another way to look at it is to look at my house, which is something I’ve been doing a lot while we’ve all been waiting for a Covid vaccine. Here it is. It’s a typical Dutch house, in a row of attached houses. Each house is a rectangular slice of the row going up two full floors, with an attic on top.

[Image: the author’s house]

Our house is about 50 to 60 years old and originally was about 80 to 85 square meters, but was built in a way that has made it very easy to adjust and extend. Ours has two roof extensions and a large extension into the back garden, which makes it around 125 square meters. Rooms have been altered, floors changed, a new bathroom added. All very affordable and very practical, allowing the original house to grow and incrementally adjust over its 50 or so years.

But now we have a problem. The kitchen is too small for us, as we would like to have an island in the middle. This would mean removing a supporting wall in the living room and replacing it with a supporting column. Basically, we’d have to rebuild the ground floor, which is a very expensive project.

In other words, the house was built decades ago without anticipating this particular kind of adjustment.

The result is not just that we won’t do the renovation but, because we know we will probably eventually need to move to another house where we can have the kitchen we want, we are not investing in other improvements that are actually relatively easy and cheap. 

So, a decision made over 50 years ago to place an immovable wall in the middle of the house is now blocking change and growth. But worse, because the growth of the house is limited by that wall, the whole building is effectively declining in quality and usability, as no other improvements are being made to it.

A well-built, extendable building should support the needs of its residents for a very long time, but only if the original design can support totally unpredictable improvements a long time after the house is ‘finished’.

There are a few stages that apply to all houses and other buildings: design (architecture), construction, maintenance, and post-construction improvements. But better buildings will shift some of the investment from the original construction to ongoing improvements. Houses that are built to be perfect from day one will gradually degrade in usability, until they have to be demolished.

It’s the same in software. The original design and construction are the development of the software product. At that stage, it is very important to create a strong but flexible and accommodating foundation, so future improvements can be easily added later.

So, we have design by architects, development by Dev teams, and initial release of the product, followed by maintenance by Ops teams, with ongoing improvement by Dev teams. It’s the Ops teams’ responsibility to continuously improve the basic platform and to allow easy extensions, both by pushing the Devs to improve and by making sure that the architecture remains flexible enough to allow for future growth.

The goal is an ongoing cycle that allows you to have smart architects, or Devs, or engineers working on the next big idea—or vaccine, or flexible building—while Ops improves what you already have.

It’s not about a magic bullet. It’s not about using particular tools or technologies. It’s about asking difficult questions of yourself right through the process—or recognising that you need to bring in someone else to ask those difficult questions or help you with the answers. Get that architecture right, then you can start thinking about how you transition from speed and innovation to quality and stability and continuous improvement.

It’s about recognising that Dev people and Ops people are not enemies; they’re seeing the same thing from different angles and from different stages of the funnel/cycle/whatever. If you have good people in your organisation already, they should be able to figure it out. And if you don’t, or they can’t, maybe it’s time to get some help.

Related Cloud Native Patterns

Dynamic Strategy

Today's tech-driven marketplace is a constantly shifting environment, no matter what business you are in—so your game plan needs to shift right along with it.

MVP Platform

Once early experiments have uncovered a probable path to success, build a simple version of a basic but fully functional and production-ready platform, with one to three small applications running on it in production.
