As part of the investigation for our new book "The Cloud Native Attitude" I interviewed a lot of expert practitioners to find out what they had tried, struggled with and achieved with Cloud Native. Below we're publishing the second of these interviews - a case study of the fashion ecommerce retailer ASOS. ASOS interests me because during the 00's I was Head of IT for an online fashion store (Figleaves.com). Like ASOS we built our own systems from scratch so I know how hard that is. I only wish we had the cloud back then!
All the interviews are in our book, which is currently available as a free eBook download.
Founded in 2000, ASOS is a highly successful, global eCommerce fashion retailer. Across their various mobile and web platforms they had 800 million visits in the first half of 2017. They have 21 million social media followers and their retail sales were just under £1B in H1 (the first half of) 2017. ASOS’s mission is to become “the world’s number-one online shopping destination for fashion-loving 20-somethings”.
Since their inception, ASOS have been tech visionaries who built their own platform in-house to meet their specific needs and resisted the urge felt by many retailers to go for off-the-shelf eCommerce products. To advance their overall objectives, ASOS have identified a number of strategic goals including: faster feature velocity (getting functionality from idea to user more quickly), improved scalability to handle peaks like Black Friday, and the ever-faster site response times that are key to online conversion. ASOS are in a slightly different technical position from our other case studies. Unlike those of Skyscanner and the FT, ASOS’s services run on Windows, not Linux.
Several years ago, ASOS determined that a key factor in achieving their goals would be to transition from on-premises, owned and self-managed servers to cloud hosting. They have been gradually moving all of their services from their own data centres to Azure with the stated aim of 100% Cloud within 2 years.
One of their aims for Cloud was to significantly reduce the operational load on their teams. They decided they would rather have their technical folk focussed on areas of greater business advantage, like features. ASOS therefore chose to run on Azure’s “Cloud Services” PaaS, i.e. they use fully managed VMs provided, monitored, patched and supported by Microsoft. ASOS just deploy applications to those VMs, using them as “units of isolation”. They found this did indeed reduce their operational overheads and so they went even further, transitioning to fully managed databases wherever possible (aka “database-as-a-service”). They now host or operate their own stateful services only when Azure does not offer a fully managed alternative.
As well as moving to the cloud, ASOS embraced a Cloud Native-style approach with heavy use of Microservices. A microservice-oriented architecture on flexible cloud infrastructure has been core to their improved feature velocity. However, they have sometimes chosen to prioritise other goals. Like FT and Skyscanner, ASOS create their microservices to “do one thing”. Unlike the other two however, in some instances ASOS group, link and deploy multiple microservices as a single process, with each microservice a Windows library.
The majority of the ASOS estate consists of the more usual discrete services communicating over (typically) REST, but ASOS group and deploy their services together where performance is particularly critical. This improves the responsiveness of these service groups, which is good, but the increased coupling has a negative impact on ASOS’s agility and ability to change those particular services, which is bad. ASOS usually choose to prioritise agility and feature velocity over performance, so most of their services are deployed using the more common single-decoupled-microservice model.
Why is grouping services more performant anyway? As we discussed in an earlier chapter of the book, microservices talking across multiple VM instances can potentially introduce significant intercommunication latency. Grouping microservices can make deployment trickier and increase coupling, which slows feature velocity (time from idea to deployment). However, it can significantly improve execution speed, which ASOS judge to be an important priority for some of their services. According to their Enterprise Architect, David Green, “the ability to provide fast response times is key to our business”.
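To make the tradeoff concrete, here is a minimal sketch in Python (not ASOS's actual code, which runs on Windows). The names StockService, LocalStockService, RemoteStockService, fake_http_transport and product_page are all hypothetical. The idea is that a service that "does one thing" can sit behind a single interface and then be either linked into the caller's process as a library or reached over a network hop. The transport here is a stand-in function, so the sketch only counts hops and serialisation work rather than measuring real latency.

```python
import json
from typing import Callable, Dict, Protocol


class StockService(Protocol):
    """Hypothetical interface for a service that does one thing."""

    def in_stock(self, sku: str) -> bool: ...


class LocalStockService:
    """The service logic packaged as a library and called in-process."""

    def __init__(self, levels: Dict[str, int]) -> None:
        self._levels = levels

    def in_stock(self, sku: str) -> bool:
        return self._levels.get(sku, 0) > 0


class RemoteStockService:
    """Same interface, but each call is serialised and sent over a transport."""

    def __init__(self, transport: Callable[[str], str]) -> None:
        self._transport = transport
        self.hops = 0  # counts round trips that would cross the network

    def in_stock(self, sku: str) -> bool:
        self.hops += 1
        reply = self._transport(json.dumps({"sku": sku}))
        return bool(json.loads(reply)["in_stock"])


def fake_http_transport(backend: StockService) -> Callable[[str], str]:
    """Stand-in for a REST round trip (assumption: no real network is used)."""

    def send(request_body: str) -> str:
        sku = json.loads(request_body)["sku"]
        return json.dumps({"in_stock": backend.in_stock(sku)})

    return send


def product_page(sku: str, stock: StockService) -> str:
    """Caller code is identical whichever deployment model is chosen."""
    return f"{sku}: {'available' if stock.in_stock(sku) else 'sold out'}"


if __name__ == "__main__":
    local = LocalStockService({"dress-1": 3, "hat-9": 0})
    remote = RemoteStockService(fake_http_transport(local))
    print(product_page("dress-1", local))   # grouped: plain function call
    print(product_page("dress-1", remote))  # discrete: serialise plus one hop
```

The grouped version avoids the serialisation and the hop, but it has to be built and released together with its caller, which is the coupling cost described above.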
{Aside - for some Linux orchestrators you can achieve a similar result using the “Affinity” feature, which tells the orchestrator to make sure that some services are always co-located on the same VM instance or “node”.}
This is a good demonstration that microservice experts still make judgements and balance tradeoffs on how they will implement a Cloud Native approach. This is true even on a service-by-service basis.
{Aside - interestingly in the ASOS architecture the majority (obviously not all!) of their service communication is with the (remote) client, so cross-VM latency isn't actually as big an issue as we might think for most of their services.}
Their interest in execution speed is also reflected in ASOS’s data architecture. They keep data as close to users as possible and make extensive use of NoSQL databases and caching. One of the attractions of a microservice architecture for ASOS is the ability to make more granular choices about how and where data is maintained, all of which helps with their critical response times.
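As a rough illustration of how caching helps response times, here is a minimal read-through cache sketch in Python. It is not ASOS's implementation. The class name ReadThroughCache, the load callback, the injectable clock and the 30 second expiry in the usage example are all assumptions chosen for the example.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Serve repeat reads from memory and go to the backing store on a miss."""

    def __init__(
        self,
        load: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load          # hypothetical call to the slower data store
        self._ttl = ttl_seconds    # how long a cached value stays fresh
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]        # fresh hit: no trip to the store
        value = self._load(key)    # miss or expired: read through
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    def slow_store(key: str) -> str:
        return f"details for {key}"

    cache = ReadThroughCache(slow_store, ttl_seconds=30)
    print(cache.get("sku-1"))  # first read goes to the store
    print(cache.get("sku-1"))  # second read is served from memory
```

With per-service data ownership, each microservice can pick its own expiry and store to suit how fresh its data needs to be.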
Another, completely different, aspect of a microservice architecture that particularly appealed to ASOS was the ability to parallelise teams and reduce handovers and blockages. This also helped them improve their feature velocity.
ASOS are extremely happy with the progress they have made using Cloud and microservices. Last year’s huge Black Friday beat all previous records for scale and responsiveness across their applications. So, what next technically for ASOS? As Azure continues to add managed stateful services, ASOS will transition to use them. They would also like to improve their server density (effective resource utilisation). To achieve this they are likely to investigate containers and orchestration, but that tooling is still less mature on Windows than Linux.
Overall a cloud (PaaS) and microservice-focussed strategy has worked very well for ASOS and they intend to continue on their current path.
Read more about our work in The Cloud Native Attitude.