Last week I published a case study of Starling Bank, one of the UK's mobile-only challenger banks, which famously "built a bank in a year" using cloud infrastructure and services. The case study is part of the new content in v2 of The Cloud Native Attitude, which we'll be launching at KubeCon in May. Come and see us at our stand!
In this post I want to zoom in on one aspect of the Starling architecture that for me is highly characteristic of banks: eventual consistency.
Classic Design Trade-offs
Stepping back, in a microservice architecture there are always design trade-offs that you have to make. Which design you choose depends on what kind of end-to-end service you are offering to your users.
One of the more fundamental decisions you make is whether you’ll choose “eventual consistency” for data in your system or “strong consistency”. What do I mean by this?
In any distributed system (e.g. a microservices one) there will be multiple copies of any piece of data. As a thought experiment, let’s consider how we might design the data management in a simple banking system (this is not necessarily how Starling works!). Consider the current balance of your bank account. You may have a cached value for your balance in your mobile app, which you are looking at right now. There will also be a value in the main database in your bank’s back-end systems.
If we design a simple banking system, we need to decide how much effort we’ll put into keeping these bank balance numbers consistent. If we decide the number must always be the same everywhere (e.g. the balance shown on your phone and the one saved in the back-end DB are always identical) that’s called a “strongly consistent” system. That sounds right! What could go wrong?
The downside of this approach is that the performance of strongly consistent systems can suck, and that can make the system less useful. What if your phone loses contact with the server? In a strongly consistent world it won’t be able to show you your balance, even if it was only updated from the server 5 minutes ago and it’s 99.99% likely to be accurate. Even a slightly stale balance is still vastly better than the paper statements we used to rely on from our banks, which might have taken several days to reach us.
Most users find strongly consistent systems annoyingly slow. It’s even worse in a system where individual operations can take a long time to complete. Transactions get queued behind one another and everything grinds to a halt. You might find your balance query stuck waiting for the successful completion of a card transaction that involves third parties and could take minutes or even hours to finish!
A common alternative is called “eventual consistency”. In an eventually consistent system, copies of data don’t always have to be identical, as long as they are designed to become consistent once all current operations have been processed. If you are happy with eventual consistency then your phone app might show you your last known balance (if it’s fresh enough to be reasonably reliable) even if the accounts DB server is not available to query right at this very moment.
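To make the phone example concrete, here is a minimal Python sketch of an app-side balance view that prefers a live read but falls back to a recent cached value when the server can't be reached. Everything in it is an assumption made for illustration (the `BalanceView` class, the `fetch` callback, the 5-minute freshness limit); it is not how Starling or any particular bank does it.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedBalance:
    amount_pence: int
    fetched_at: float  # timestamp in seconds, supplied by the caller


class BalanceView:
    """Hypothetical app-side view of a balance (illustrative only)."""

    def __init__(self, fetch: Callable[[], int], max_age_seconds: float = 300.0):
        self._fetch = fetch              # assumed to raise ConnectionError when offline
        self._max_age = max_age_seconds  # how stale a cached value may be
        self._cache: Optional[CachedBalance] = None

    def read(self, now: float) -> Tuple[int, bool]:
        """Return (balance, is_live). is_live is False for a cached value."""
        try:
            amount = self._fetch()
            self._cache = CachedBalance(amount, now)
            return amount, True
        except ConnectionError:
            cache = self._cache
            # Eventual consistency: a recent copy is good enough to display.
            if cache is not None and now - cache.fetched_at <= self._max_age:
                return cache.amount_pence, False
            # Nothing cached, or the cached copy is too old to trust.
            raise
```

A strongly consistent client would be the same code without the `except` branch: no server, no balance. The `is_live` flag lets the app tell the user when the number on screen might be slightly out of date.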
Eventual consistency is actually the traditional approach taken for transactions in banking. Banking operations have historically often involved steps that might take a variable time to complete or include third parties. Consider cheques (or checks in the US). I can take out my chequebook and immediately write and spend $1M in cheques with retailers who are willing to take the risk of accepting them. Unfortunately, the cheques will “bounce” (the bank won’t honour them) when eventual consistency is reached, usually some days later, and everyone realises I didn’t have $1M to spend. In this case, the retailers will lose out. The banking system doesn’t stop me issuing bad cheques. Society, the law and my ability to balance a chequebook do.
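The cheque story can be sketched in a few lines of Python. This is a toy model under my own assumptions (the `Account` and `Cheque` names, a single `clear_all` step standing in for the days-long clearing process), not a description of real cheque clearing. The point is only that writing a cheque involves no balance check, and the truth is established later.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Cheque:
    payee: str
    amount: int  # whole currency units, to keep the toy simple


@dataclass
class Account:
    balance: int
    uncleared: List[Cheque] = field(default_factory=list)

    def write_cheque(self, payee: str, amount: int) -> Cheque:
        # No balance check here: the payee accepts the cheque on trust.
        cheque = Cheque(payee, amount)
        self.uncleared.append(cheque)
        return cheque

    def clear_all(self) -> Tuple[List[Cheque], List[Cheque]]:
        """Settle outstanding cheques in order; return (honoured, bounced)."""
        honoured: List[Cheque] = []
        bounced: List[Cheque] = []
        for cheque in self.uncleared:
            if cheque.amount <= self.balance:
                self.balance -= cheque.amount
                honoured.append(cheque)
            else:
                bounced.append(cheque)  # the payee loses out
        self.uncleared = []
        return honoured, bounced
```

Between `write_cheque` and `clear_all` the payee's view of the world and the bank's view disagree. Clearing is the moment they become consistent again.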
Why Our Modern Lives Depend On Eventual Consistency
The banks, retailers and our economic system in general have traditionally been comfortable with eventual consistency, backed up with:
- a culture of regular reconciliation
- tricks to improve perceived performance
- management and charges for bad behaviour (like overdrafts and, finally, legal proceedings).
That’s because the number of effectively honest folk outweighs the dishonest ones, particularly in a country with good rule of law. Establishing this form of trust, and self-management by individuals, is extremely valuable to society and commerce. Without it, commercial exchanges are slow, or expensive, or both.
Note: at a distributed systems conference recently, I noticed that several folk used banking applications to illustrate the need for strong consistency. “What if someone deposited money at time T, then bought something at T+5 and the money wasn’t there!!” The irony is that this happens all the time. Banking systems have been coping with these kinds of issues for many years using several tricks of eventual consistency. For example, if you have an overdraft facility the purchase may go ahead using that. If the money then arrives in your account within a reasonable time, you will not be charged for the overdraft or see it on your bank statement. In that case the system appears highly performant to the user. There is always some risk (that you’ll never pay off your overdraft, for example) but banking is full of risks like that, which just need to be managed.
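Here is a rough Python sketch of that overdraft trick, again purely illustrative. The `OverdraftAccount` class, the grace period and the `fee_due` check are all names and rules I have made up for the example. Real banks will have their own, more involved, policies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverdraftAccount:
    balance: int            # pence; may go negative down to -overdraft_limit
    overdraft_limit: int    # pence
    grace_seconds: float    # how long an overdraft may last before a fee applies
    overdrawn_since: Optional[float] = None

    def purchase(self, amount: int, now: float) -> bool:
        """Approve the purchase if it fits within balance plus overdraft."""
        if self.balance - amount < -self.overdraft_limit:
            return False
        self.balance -= amount
        if self.balance < 0 and self.overdrawn_since is None:
            self.overdrawn_since = now  # start the grace-period clock
        return True

    def deposit(self, amount: int, now: float) -> None:
        """Apply a deposit once it actually arrives."""
        self.balance += amount
        if self.balance >= 0:
            self.overdrawn_since = None  # back in credit

    def fee_due(self, now: float) -> bool:
        """True only if the account has been overdrawn for longer than the grace period."""
        return (
            self.overdrawn_since is not None
            and now - self.overdrawn_since > self.grace_seconds
        )
```

In the T and T+5 scenario, the purchase at T+5 is approved against the overdraft, the deposit lands a little later, and `fee_due` never becomes true. The user just sees a purchase that worked.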
Trust and eventual consistency are currently part of the bedrock of society. We should think very hard before we reject either.
Read more about our work in The Cloud Native Attitude.