This blog post is part of a two-part series.
In this post I’d like to provide an overview of what unikernels are, how they fit into the cloud computing landscape, and which projects are driving the technology behind them. In Part 2, we’ll build and run MirageOS and Rumprun unikernels and, hopefully, give a positive answer to the question: ‘How can I run my existing application as a unikernel today?’
A Little OS History
Before diving into unikernels, let's take a very quick detour through operating systems history and the popular tools that make up the cloud computing landscape today.
In the ‘50s and ‘60s, computers were expensive and idle time had to be kept to a minimum. As computers were capable of running only one program at a time, batch scheduling was employed to use compute time efficiently. One problem with this approach is the lack of instant feedback. To address this, timesharing was invented, giving users the illusion of having the whole system to themselves.
The timesharing operating system juggles between users and programs, dealing with moving programs in and out of memory, scheduling tasks fairly, and providing users with private, permanent storage—all concepts we still find in operating systems today. What is very different now is that instead of isolating users from other users, the common use case is isolating users from themselves.
We are no longer the sole owner and author of the code we're running. Privileged access to hardware is also a questionable concept in an age when everybody owns their own hardware (either physical or virtual). In the context of cloud computing, machines are provisioned to serve a single goal (such as a web service or a database). Before LXC and Docker, the potential for resource waste was very high when a single-purpose application ran in a full-blown operating system.
Containers and orchestration systems like Kubernetes have mitigated this bloat. Virtual machines are provisioned to run multiple applications in containers, which share the host kernel to speed up the application management lifecycle. This greatly improves resource utilisation, but it places a large burden on the kernel syscall API that containers use to interact with the host OS.
This API is very broad, as it offers kernel support for process and thread management, memory, networking, filesystems, and more. That breadth makes the syscall API more difficult to secure than the simpler x86 ABI offered by VMs.
This is where unikernels can provide value by further reducing resource usage and providing a stronger security guarantee.
What Is a Unikernel?
Let’s consult unikernel.org: ‘Unikernels are specialised, single-address-space machine images constructed by using library operating systems’.
This definition covers the minimalism of the single-purpose operating system and the ideal of the library operating system, which provides the developer with components to pick from when building the minimal hardware interfacing layer. A unikernel introduces a shift from runtime configuration to compile-time configuration. It includes only the functions needed to make an application work, and nothing more.
Because of this drastic reduction in dependencies, unikernels are also very quick to start, which makes them viable as on-demand services. The specialised image means that the configuration is also baked in at build time, moving the focus from deploying and configuring systems to deploying and configuring software. In the context of unikernels, the system is just a library.
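To make the idea of picking components at build time more concrete, here is a minimal sketch in Python. It is a toy model and not how any real unikernel toolchain works: the component catalogue, its names, and the `build_image` helper are all hypothetical. It only shows that the build step resolves the components an application needs, leaves everything else out, and freezes the configuration into the resulting image.

```python
# Hypothetical component catalogue: name -> components it depends on.
# These names are illustrative and do not correspond to any real library OS.
LIBRARY_OS = {
    "http": {"tcp"},
    "tcp": {"ip"},
    "ip": {"netdriver"},
    "netdriver": set(),
    "filesystem": {"blockdriver"},
    "blockdriver": set(),
    "console": set(),
}


def build_image(app_needs, config):
    """Resolve everything at build time: component closure plus frozen config."""
    selected = set()
    pending = list(app_needs)
    while pending:
        name = pending.pop()
        if name not in selected:
            selected.add(name)
            # Pull in whatever this component itself depends on.
            pending.extend(LIBRARY_OS[name])
    # The configuration becomes a read-only part of the image itself.
    return {
        "components": frozenset(selected),
        "config": tuple(sorted(config.items())),
    }


image = build_image({"http"}, {"port": 8080, "address": "10.0.0.2"})
print(sorted(image["components"]))  # ['http', 'ip', 'netdriver', 'tcp']
```

In this toy model, an application that only serves HTTP never gets the filesystem, block driver or console components, and changing the port means building a new image rather than editing a file on a running system.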
The best analogy to developing unikernels is developing for embedded systems: you would develop on a fully-capable machine, with the luxury of complex debugging tools and you would build the tiniest artifact possible for production.
These minimality benefits are made possible by accepting a number of limitations:
- Single process (but multiple threads)
Adding process management carries significant overhead. There has to be a way to start, stop and inspect a process, to provide inter-process communication, and so on.
- Single user
Multiple users require authentication and authorisation, resource isolation, and so on. These aren’t necessary in a single-purpose application.
- Limited debugging
In their production form, unikernels have very limited debugging capabilities. Failures need to be reproduced on a development platform.
- Limited library ecosystem
Functionality found in existing general-purpose operating systems might be missing, and keeping in sync with the features offered by, for instance, the Linux kernel might be an intractable task.
The first three points are easy to accept when your goal is to build the smallest possible single-purpose application. But if you rely on a broad range of functionality (offered by, for instance, the Linux kernel), the last point can be a showstopper.
There are already a large number of unikernel projects. They can be split into two major approaches.
- Clean slate: With the assumption of building a single-purpose OS comes the freedom to use modern tools to build it: modularity, declarative code, avoiding boilerplate. The operating system and application layers are rethought from scratch, using a high-level programming language that makes writing high-quality system libraries a tractable task.
- Legacy: The goal is to run existing software unmodified or with minor changes. This is usually achieved by refactoring an existing operating-system codebase into a library operating system.
We will take a closer look at MirageOS for the clean-slate approach and at Rumprun for the legacy approach in Part 2 of this series. However, there are multiple projects supporting the unikernel ecosystem or building their own unikernel that are worth mentioning before going into details about any specific project.
MiniOS is a tiny OS kernel distributed with the Xen hypervisor. It is used as a basis for developing unikernels; examples built on it include ClickOS and Rumprun.
Solo5 is meant to be an interface layer between a unikernel and the hypervisor. Unlike MiniOS, it targets KVM/QEMU rather than Xen Project. Where Xen Project leverages paravirtualisation to allow the unikernel to talk to the hypervisor, Solo5 contains a hardware abstraction layer to work with the hardware virtualisation used by its target hypervisors.
The Haskell Lightweight Virtual Machine, or HaLVM, is a port of the Glasgow Haskell Compiler toolsuite to enable developers to write high-level, lightweight virtual machines that can run directly on the Xen hypervisor.
ClickOS, a high-performance, virtualised software middlebox platform, is a unikernel specialised for Network Function Virtualisation. ClickOS virtual machines are small (5MB), boot quickly (about 30 milliseconds), and add little delay (45 microseconds). Over 100 of them can run concurrently while saturating a 10Gb pipe on a commodity server.
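As a rough back-of-the-envelope reading of these figures, the sketch below works out what 100 such virtual machines add up to. It assumes that the 5MB figure can be treated as a per-VM footprint and that the link is shared evenly between the VMs. Both are simplifying assumptions made here for illustration, not claims from the ClickOS project.

```python
# Figures quoted above for ClickOS.
vm_size_mb = 5          # size of one ClickOS virtual machine
vm_count = 100          # lower bound: "over 100" VMs running concurrently
link_gbit_per_s = 10    # the saturated 10Gb pipe

# Assumption: every VM costs the full 5MB and the link is shared evenly.
total_footprint_mb = vm_size_mb * vm_count
per_vm_mbit_per_s = link_gbit_per_s * 1000 / vm_count

print(total_footprint_mb)   # 500
print(per_vm_mbit_per_s)    # 100.0
```

Under those assumptions, the whole set of 100 middleboxes fits in about 500MB, with each one handling on the order of 100Mb/s.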
IncludeOS is an operating-system library for building unikernels, written in C++. It can take advantage of multiple CPUs, using threads to distribute the workload across CPU cores. It also maintains limited source-code compatibility with Linux.
Next: In Part 2 of this blog series, Mircea will examine two unikernel projects, MirageOS and Rumprun.