Cloud Native Blog - Container Solutions

How to Design an Internal Developer Platform

Written by Cari Liebenberg | Sep 9, 2022 2:35:01 PM

WTF is an IDP?

There are a couple of existing definitions that form a useful starting point when trying to define an IDP. Thoughtworks, via Martin Fowler’s blog, gives us this definition of a digital platform:

A digital platform is a foundation of self-service APIs, tools, services, knowledge and support which are arranged as a compelling internal product. Autonomous delivery teams can make use of the platform to deliver product features at a higher pace, with reduced coordination.

And Humanitec, who sponsored our whitepaper “The Rise of the Internal Developer Platform”, has this definition of the Internal Developer Platform (IDP):

An Internal Developer Platform, or IDP, is a self-service layer that allows developers to interact independently with their organisation’s delivery setup, enabling them to self-serve environments, deployments, databases, logs and anything else they need to run their applications.

An internal developer platform (IDP) can be seen as a specific implementation of a digital platform, offering self-service functionality as a product to meet a variety of a team’s needs. The focus for an internal developer platform is improving the release process that teams need to follow to release software.

Let’s think about what a modern team might need to release software in the cloud.

  • A repository to store code
  • Pipelines that will run when code is updated
  • Automated scans that run when the pipeline runs, which might include:
    • Security checks
    • Code quality checks
    • Running tests
    • Creation of ephemeral environments
  • A release process for releasing changes (often referred to as a software delivery lifecycle or SDLC)
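
The list above can be sketched as a tiny pipeline runner. This is only an illustration, not how any particular CI system works: the `Stage` and `run_pipeline` names are made up for this example, and each check is a placeholder that always passes.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    """One step of the pipeline (hypothetical name, for illustration only)."""
    name: str
    run: Callable[[str], bool]  # receives a commit id, returns True on success


def run_pipeline(commit: str, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first one that fails."""
    completed: List[str] = []
    for stage in stages:
        if not stage.run(commit):
            return False, completed
        completed.append(stage.name)
    return True, completed


# Placeholder checks: a real platform would call scanners, linters and
# test runners here instead of returning True.
stages = [
    Stage("security checks", lambda commit: True),
    Stage("code quality checks", lambda commit: True),
    Stage("tests", lambda commit: True),
    Stage("ephemeral environment", lambda commit: True),
]

ok, completed = run_pipeline("abc123", stages)
print(ok, completed)
```

The point of the sketch is that the order and content of the stages can be defined once by the platform, so that each product team does not have to assemble them from scratch.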

In addition to the release process is the run model; that is, where will the application run and be maintained? This in turn leads to needing infrastructure for running the software. The infrastructure layers to consider include:

  • Networking
    • How is traffic handled?
    • What traffic is allowed?
    • What ports should be accessible?
    • What should be public, what should be private?
  • Compute 
    • Is the team using VMs, containers, or going serverless?
  • Storage 
    • Where is the data going to live?
    • How is the data going to remain secure?
    • What and who can access the data?
  • Observability
    • What is happening across your solution?
    • Who is accessing your systems and what changes are being made?
    • How healthy are your systems?
    • How well are your systems performing?
    • What alerts will help your team?
  • Security
    • What access control is in place?
    • What checks and scans are in place for security vulnerabilities?
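
One way to make these questions concrete is to capture the answers in a small, declarative description of a service that can be checked automatically. The sketch below is an assumption made for illustration: `ServiceSpec`, its fields and `open_questions` are hypothetical names, and it covers only a handful of the questions listed above.

```python
from dataclasses import dataclass, field
from typing import List

ALLOWED_COMPUTE = {"vm", "container", "serverless"}


@dataclass
class ServiceSpec:
    """Answers to some of the infrastructure questions (hypothetical fields)."""
    name: str
    compute: str                                          # vm, container or serverless
    public_ports: List[int] = field(default_factory=list)  # everything else stays private
    data_encrypted: bool = False
    data_readers: List[str] = field(default_factory=list)  # who can access the data
    alerts: List[str] = field(default_factory=list)


def open_questions(spec: ServiceSpec) -> List[str]:
    """Return the questions that the spec has not answered yet."""
    problems: List[str] = []
    if spec.compute not in ALLOWED_COMPUTE:
        problems.append("compute: choose vm, container or serverless")
    if not spec.data_encrypted:
        problems.append("storage: data is not marked as encrypted")
    if not spec.data_readers:
        problems.append("storage: nobody is listed as allowed to access the data")
    if not spec.alerts:
        problems.append("observability: no alerts defined")
    return problems


draft = ServiceSpec(name="key-data-app", compute="container", public_ports=[443])
for problem in open_questions(draft):
    print(problem)
```

A spec like this gives the team a short list of decisions still to be made, rather than leaving each layer to be rediscovered later.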
       

On premises, there would typically be dedicated teams that you hand over to for these layers. The shift left to the “you build it, you run it”/DevOps approach has clear advantages in terms of breaking down silos between development and operations, but a side effect is that the development team now owns a lot more responsibility for its solution. This responsibility comes at the high cost of increased cognitive load.

One workaround for this is to hire a variety of individuals with a variety of skills, and thereby have people with the skills you need all in one team. However, this can become costly, and can also lead to large amounts of duplicate effort.

With an internal developer platform, you have a dedicated team that looks at each of these layers, working with the product teams, and creating well-defined solutions for engineering to take and use.

This decreases the cognitive load of the individual engineering teams because they no longer have to solve common problems themselves and instead can focus on the implementation based on existing solutions. In addition, as Monzo’s Suhail Patel noted elsewhere on WTF, increased standardisation as to how services are implemented makes it easier for developers to move between different teams.

To summarise, an internal developer platform is an approach to standardising your software delivery process and providing solutions that meet the architectural needs of the product teams. It gives those teams everything they need to release faster whilst keeping control over their entire end-to-end solution. This approach decreases the cognitive load of product teams because the complexities around releasing and running the software have been solved ahead of time.

Why would you want an Internal Developer Platform (IDP)?

You might be making a strategic shift from on premises to the cloud. You might already be on the cloud, but finding that working there has become increasingly challenging.

Some signs that indicate you need an Internal Developer Platform (IDP):

  • You plan on having multiple product teams
  • You are noticing your product teams are experiencing a lot of cognitive load
  • Releasing software on the cloud takes a lot of effort
  • You plan to migrate to the cloud and want to create a golden path for new teams

Where to start when designing an IDP?

What is the business goal?

Before designing your IDP you need to align on what the business goal is. As a starting point, run a workshop with all the stakeholders in a room and make sure everyone aligns on the critical goal.

Is the goal to increase the speed of releasing software? Is the goal to collect and store valuable data? Is the goal to change the lives of your customers?

Having a clearly articulated and aligned goal will drive the focus of the platform and what the teams prioritise.

Design your teams

Conway’s law states: “Any organisation that designs a system (defined broadly) will produce a design whose structure is a copy of the organisation's communication structure.” This makes it clear that the starting point when designing an IDP should be the team structure and the communication between teams. In some cases, this might involve executing an Inverse Conway Manoeuvre:

An organisation should focus on organising team structures to match the architecture they want the system to exhibit rather than expecting teams to follow a mandated architecture design.

James Lewis

There is an interesting point to make here, which is that the people who have the power to design an organisation’s teams are also the people affecting the software. There is a strong correlation between the two, and a big mistake organisations make is to structure teams without understanding how it affects the software in the long run.

How should you structure your teams?

If you haven’t already done so, a good place to start is to read the book Team Topologies, which provides a good foundation as to how to structure software teams, and also features on our recommended reading list. A follow-up workbook is useful if you are working with remote teams, and you can listen to our “Hacking the org” podcast with one of the co-authors, Manuel Pais, for more information.

However, to give a high-level overview, the book suggests that there are 4 team types:

  • Stream-aligned team
  • Enabling team
  • Complicated-subsystem team
  • Platform team

Then there are 3 interaction modes the teams can use to handle communication:

  • Collaboration
  • X-as-a-Service
  • Facilitating

You could decide on a structure similar to the following example, but it all depends on your business goals. In this example, we have the following teams:

Platform teams:

  • Landing Zone - responsible for making sure the cloud is set up for use; this can include governance rules and handling the billing
  • Other Platforms - these can be any self-service type platform that caters to the needs of teams
  • Internal Developer Platform - responsible for standardising your software delivery process and providing solutions to your product teams

Product teams:

  • The individual product teams that will be responsible for producing and releasing software

Enabling teams:

  • SRE team - they are responsible for helping product teams set up pipelines, automate their release flows, and set up the right observability practices within their applications and systems
  • Enterprise Architecture team - they review the product team’s architecture and are responsible for understanding the big picture. They help product teams design architectures for their applications and systems in such a way that they will live harmoniously with the rest.

In the beginning, before any platform exists, you will need to create a core team composed of team members from the landing zone, the internal developer platform, and the product team.

From there, start setting up your enabling teams to support the core team, as they will work closely with the core team in the beginning and share the same goal.

The goal with this core team is to build the first prototype together and release it to production.

This approach ensures that everyone is working very closely together, and that the communication flow is as quick as possible. The fundamental goal, to emphasise again, is to get to production.

A mistake made by many companies creating their IDP is that they start too big and do not have a clear aligned goal. They then take a long time to reach production because they are focusing on too much at once.

This core team should know the business goal, and from the business goal decide on and choose a small valuable piece of the business functionality to move to the new platform. This should then be the aligned goal across all the teams.

Keeping the teams focused in this way will make sure they follow a “just enough” approach, where they put in just enough effort to see effective results. No more, no less.

Break it into phases

Creating the IDP will be a phased process.

The IDP will evolve, so the approach is to start building the IDP and assume it will go through other phases of working and delivering value as it matures. A good mental model of the phases you will likely face is the POC, Day 1 and Day 2 phases.

POC is what has been described already: building just enough to deliver a focused amount of value to production.

Day 1 is when the core team is split out into its respective teams, eg Landing Zone Platform team, IDP Platform team, Product team and then, if they are not already separate teams, the enabling teams (SRE and Enterprise Architecture as an example). Part of Day 1 also involves maturing the services built to support the POC, and then planning the work of more product teams.

Day 2 is where the focus moves to the run model. Consider the lifespan of the software and how long it will need to be supported and maintained. Also consider the big picture: how are multiple applications going to run alongside each other? This is the phase where mature SRE practices become a bigger focus. Observability across all the applications and services becomes a key strategy, as does streamlining releases and maintenance tasks with automation.

Drive the right value

It is important to have everyone working on something that results in value, otherwise you will build functionality that is not needed and this will delay getting the right results from the IDP.

Remember, the IDP is meant to help product teams release faster. The primary focus of the IDP should be to move obstacles out of the product team’s way. Only once the product team has the ability to release their software autonomously should the platform team look at additional features.

You can follow this 3-step approach to make sure every team is working on a piece of value.

First, decide on the value the team needs to produce; for example, the product team wants to release a small application showing some key data. Second, list out what the team requires in order to release that small application, eg:

  • Pipelines to automate the release of the code
  • A place to store and retrieve data
  • A place to store code 
  • A way to run containerised applications
  • A secure way to store and use certificates
  • A secure way to store and use secrets
  • etc

Then create the milestones for the team.

Once those 3 steps have been done, the requirements can be passed up to the IDP team.

The IDP team will then do the same 3 steps. The value they need to produce is based on the requirements they got from the product team, eg:

  • Self hosted GitLab runners
  • EKS cluster for product teams to run their containerised applications on
  • Monitoring of activity across the cloud
  • Access controls across the cloud

The IDP team then does the same, that is, they list out the requirements they need in order to be able to solve the product team’s needs, eg:

  • Get billing approval
  • Set up OU policies
  • Set up centre of excellence teams (eg Cloud Security and Risk)
  • Decide on allowed AWS account structures
  • etc

And then they plan out the milestones for creating the Platform. They also pass the requirements they listed up to the landing zone team.

Now the landing zone team can do the same. The landing zone team needs to make sure the IDP team is able to consume the cloud in an approved way in order to build out the IDP. Their focus could include more organisational policy structures and approvals for the cloud, eg:

  • Get billing approval from finance
  • Plan the organisation’s OU policies
  • Decide and get approval for AWS account policies
  • Decide on billing rules
  • etc

This is how everyone works together to create the right, valuable functionality across all the teams, which enables end-to-end success.
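
The flow above, where each team decides on its value, lists its requirements, plans milestones and then passes the requirements up, can be sketched in a few lines. The `Team` class and `cascade` function are hypothetical names used only for this illustration. The sketch also makes two simplifying assumptions: the upstream team takes the downstream requirements verbatim as its value (in practice it would translate them into concrete services), and there is one milestone per item of value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Team:
    name: str
    upstream: Optional["Team"] = None  # the team that receives our requirements
    value: List[str] = field(default_factory=list)
    requirements: List[str] = field(default_factory=list)
    milestones: List[str] = field(default_factory=list)

    def plan(self, value: List[str], requirements: List[str]) -> None:
        """The 3 steps: decide on value, list requirements, create milestones."""
        self.value = list(value)
        self.requirements = list(requirements)
        # Assumption for this sketch: one milestone per piece of value.
        self.milestones = [f"Deliver: {item}" for item in self.value]


def cascade(team: Optional[Team], value: List[str], needs: Dict[str, List[str]]) -> List[str]:
    """Plan each team in turn; its requirements become the upstream team's value."""
    order: List[str] = []
    while team is not None:
        team.plan(value, needs.get(team.name, []))
        order.append(team.name)
        value = team.requirements
        team = team.upstream
    return order


landing_zone = Team("landing zone")
idp = Team("IDP", upstream=landing_zone)
product = Team("product", upstream=idp)

needs = {
    "product": ["Pipelines to automate the release of the code", "A way to run containerised applications"],
    "IDP": ["Get billing approval", "Decide on allowed AWS account structures"],
    "landing zone": [],
}

order = cascade(product, ["A small application showing some key data"], needs)
print(order)
print(idp.milestones)
```

Running it shows the planning order from the product team up to the landing zone team, with the IDP team’s milestones derived from what the product team asked for.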

Imagine if the teams were not aligned on a clear, common goal, and not focusing together to get a small piece of value to production. At any point in the flow mentioned above, a single team can become the blocker for the other teams. This is very common, but it is also avoidable. Teams can very easily fall back into the “throw it over the wall” mindset without realising it. We want to avoid that by working towards autonomous, transparent teams with clear goals, rather than isolated teams that have unclear, mismatched goals and turn themselves into silos.

Working this way helps you POC your processes, not just your software.

To summarise

An IDP can offer many benefits to your organisation. The benefits include:

  • Standardising your software delivery process
  • Improving the efficiency of other teams
  • Faster, more reliable software releases
  • End to end control of your software delivery lifecycle
  • Solving complexities within a single team and decreasing redundant effort
  • Decreasing cognitive load

When designing your IDP you should:

  • Have a clear business goal
  • Align all leadership on that goal
  • Structure teams and communication models
  • Create a core team of individuals from each team
  • Form the enabling teams to support the core team
  • Identify a small piece of business value as a POC to build
  • All teams focus on getting the POC to production
  • The core team focuses on building only enough functionality for the POC
  • Within the core team, focus team members on their respective piece of value to deliver

As an example:

  • Decide on value (eg an application that shows useful data to an authenticated and authorised user)
  • List the requirements for different parts of the solution, eg:
    • Business requirements to be given to the product team
    • Product team requirements to be given to the IDP team
    • IDP team requirements to be given to the landing zone team
  • Plan out milestones (commit to delivering)
  • Release POC to production

Avoid falling into the “over the wall” mindset and creating silos. Focus on a POC phase and create an IDP with a core team that combines members across all the teams, and agree on clear simple goals early in the process. Set up the right processes to support the teams that will benefit from the IDP. Get the POC to production. Once the POC is in production, split out the teams and proceed to the Day 1 and Day 2 phases.