This blog post is the conclusion of a series.
In Part 1 of this blog series about unikernels, I explained what unikernels are, and their role in reducing resource usage within operating systems and making them more secure. I also discussed some of the projects that support the unikernel ecosystem. In this conclusion of the series, I’ll take a look at two unikernel projects: MirageOS, which follows the clean-slate approach, and Rumprun, which takes the legacy approach.
As a refresher: the clean-slate approach builds the unikernel and its libraries from scratch in a single high-level language, while the legacy approach reuses code from an existing operating system so that existing POSIX software can run with little or no modification.
Now, let’s get on with our deep dive into the two projects, MirageOS and Rumprun.
MirageOS, which takes the clean-slate approach, is one of the oldest and most established unikernel efforts. It is built around the OCaml language, with libraries that map directly to operating-system constructs when compiled for production deployment. MirageOS includes functional implementations of protocols ranging from TCP/IP and DNS to SSH, OpenFlow, HTTP, XMPP, and Xen inter-VM transports.
It works by treating the Xen hypervisor as a stable hardware platform, allowing the project to focus on high-performance protocol implementations without worrying about having to support all the device drivers found in a traditional OS.
The modularity and code-generation features of OCaml make it easy to target both the development platform (unix) and the production platform (a modern hypervisor such as Xen or KVM).
To give you a taste of developing unikernels with MirageOS, we will go through the `hello-world` unikernel in this repository. The repository holds multiple unikernel examples that can be picked up and adjusted to your needs (such as a TLS-enabled static website server). We will not go through the details of OCaml, but an up-to-date and very reliable resource for learning the language can be found here.
In the tutorial/hello folder you will find two files: unikernel.ml and config.ml. The former contains the unikernel functionality and the latter contains the configuration.
(* unikernel.ml *)
open Lwt.Infix

module Hello (Time : Mirage_time_lwt.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "hello");
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
Exploring unikernel.ml, we can see that `Hello` is a module that takes a `Time` module as a functor parameter. A pre-compilation step generates a platform-specific concrete implementation of that parameter, depending on the platform you’re building against (for example, unix). The unikernel module needs to implement a `start` function, which in this case logs “hello” four times, sleeping one second between messages, before it exits.
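To make the functor mechanism concrete, here is a hand-written sketch of applying `Hello` to a time implementation and running it on a Unix development host. This is not the code that `mirage configure` generates, and `Fake_time` is a hypothetical module that assumes `Mirage_time_lwt.S` is essentially a `sleep_ns` function over Lwt:

(* sketch.ml -- illustrative only; the real entry point is generated by `mirage configure` *)

(* A hypothetical Time implementation backed by Lwt_unix, matching the shape
   of the signature that Hello expects. *)
module Fake_time = struct
  type +'a io = 'a Lwt.t
  let sleep_ns ns = Lwt_unix.sleep (Int64.to_float ns /. 1e9)
end

(* Apply the functor to obtain a concrete module, then run its start function. *)
module Main = Unikernel.Hello (Fake_time)

let () =
  (* A basic Logs reporter so the "hello" messages are visible. *)
  Logs.set_reporter (Logs_fmt.reporter ());
  Logs.set_level (Some Logs.Info);
  Lwt_main.run (Main.start ())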
(* config.ml *)
open Mirage

let main =
  foreign
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () =
  register "hello" [main $ default_time]
The configuration file is an OCaml module that calls `register` to create one or more jobs, each of which represents a process with a start/stop lifecycle. The call to the `foreign` function specifies that the unikernel entrypoint is the `Hello` module in the unikernel.ml file and that the `duration` package is additionally required for building the unikernel.
When run as a virtual machine, the unikernel boots like a conventional OS, so it must be passed references to devices such as the console, network interfaces, and block devices at startup. In this particular case, the unikernel requires a time source, which is platform dependent. It will be provided during the configuration step, when the target platform is specified.
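To make the device-passing concrete, here is a sketch of how the same unikernel could additionally request a console device. It uses the `console`, `default_console`, and `Mirage_console_lwt.S` names from the Mirage 3-era DSL shown above; these are assumptions on our part and may differ in other MirageOS versions:

(* config.ml -- sketch: the job now depends on a console as well as a time source *)
open Mirage

let main =
  foreign
    ~packages:[package "duration"]
    "Unikernel.Hello" (console @-> time @-> job)

let () =
  register "hello" [main $ default_console $ default_time]

(* unikernel.ml -- sketch: start receives the console device first *)
module Hello (C : Mirage_console_lwt.S) (Time : Mirage_time_lwt.S) = struct
  let start console _time = C.log console "hello from the console device"
end

At configure time, `default_console` resolves to whatever console implementation the chosen target provides, so the same unikernel code runs unchanged on unix and on a hypervisor.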
The unikernel can be built against a UNIX target, for development purposes.
> mirage configure -t unix
> make depend && make
> ./main.native
And then, when targeting production (such as KVM), the only difference is the configure step. How you run the resulting artefact depends on the target platform. For example, to run it with the Solo5 hardware-virtualised tender (hvt), you would do:
> mirage configure -t hvt
> make depend && make
> solo5-hvt hello.hvt
It’s also relatively easy to deploy such an image on an instance in the cloud (like GCP).
> mirage configure -t virtio
> make depend && make
> solo5-mkimage -f tar unikernel.tar.gz unikernel.virtio
> gsutil cp unikernel.tar.gz gs://my_storage_location
> gcloud compute images create unikernel-vxx --source-uri gs://my_storage_location/unikernel.tar.gz
> gcloud compute instances create --image unikernel-vxx --machine-type=f1-micro this-particular-instance
There are more than 50 supporting libraries for MirageOS, ranging from distributed databases to a DNS protocol implementation. They support the goal of building clean-slate unikernels that can run on a wide array of virtualisation platforms: Xen, KVM, even Qubes OS.
Rumprun is an implementation of a rump kernel and can be used to transform just about any POSIX-compliant program into a working unikernel. With Rumprun, it is theoretically possible to compile most of the programs found on a Linux or Unix-like operating system as unikernels.
The concept of rump kernels comes from the world of NetBSD. Unlike most operating systems, NetBSD is specifically designed to be ported to as many hardware platforms as possible. Thus, its architecture was always intended to be highly modular, so drivers could be easily exchanged and recombined to meet the needs of any target platform. The Rump Kernel project provides the modular drivers from NetBSD in a form that can be used to construct lightweight, special-purpose virtual machines.
Rump kernels started from the need to develop drivers for the NetBSD kernel and test them in userspace; the main effort was then refactoring that codebase to look like a library operating system.
In the next sections we will go through the steps of running an application as a rump kernel.
(The following steps have been tested on Ubuntu 16.04; newer releases ship a newer version of GCC that breaks the Rumprun platform build step.)
Rump kernels are always cross-compiled, and it’s very likely that if an application supports cross-compilation, it will be runnable as a rump kernel. There is no compiler distributed with Rumprun; the framework uses whatever compiler is available on the build host. The target platform for the unikernel can be either `xen` or `hw` (`xen` for the Xen hypervisor and `hw` for anything else). Depending on your host and target environments, the platform build wraps the appropriate compiler and reports the wrapper in its output:
> git clone http://repo.rumpkernel.org/rumprun
> cd rumprun
> git submodule update --init
> CC=cc ./build-rr.sh hw
[...]
>> toolchain tuple: x86_64-rumprun-netbsd
>> cc wrapper: x86_64-rumprun-netbsd-gcc
>> installed to "/root/rumprun/./rumprun"
>>
>> Set tooldir to front of $PATH (bourne-style shells)
. "/root/rumprun/./obj-amd64-hw/config-PATH.sh"
>>
>> ./build-rr.sh ran successfully
In this case, the build process produces `x86_64-rumprun-netbsd` as the toolchain tuple and `x86_64-rumprun-netbsd-gcc` as the compiler wrapper. One final step is adding the platform binaries to the path.
> export PATH="${PATH}:$(pwd)/rumprun/bin"
We’re going to build and run a “Hello World!” C application as a unikernel.
/* hello.c */
#include <stdio.h>
#include <unistd.h>

int main()
{
    printf("Hello!\n");
    sleep(3);
    printf("Goodbye!\n");
    return 0;
}
Compiling the application is straightforward; the only difference from a usual compilation is that we’re using the wrapped compiler.
> x86_64-rumprun-netbsd-gcc -o hello-rumprun hello.c
The fundamental difference comes when running the application, as rump kernels need to be ‘baked’ into a bootable image. This ability is provided by the `rumprun-bake` binary, which takes a component configuration (exactly what the rump kernel needs to run: e.g., the network driver), the output binary, and the input binary. In this case we will use the `hw_generic` component configuration.
> rumprun-bake hw_generic hello-rumprun.bin hello-rumprun
The `rumprun` tool can now be used to run the unikernel.
> rumprun qemu -i -g '-curses' hello-rumprun.bin
The `-i` flag is required for observing the console output produced by the application, and the `-g` flag passes options to `qemu` (`-curses` is useful in case you are testing in a headless VM).
An impressive repository of software running on the Rumprun unikernel is available here, and most of the packages require no or only minimal modifications to allow cross-compiling. This makes it possible to run applications as complex as a RAMP stack, with NGINX, MySQL, and PHP each built on Rumprun.
Coming from the Docker/Kubernetes ecosystem, it might be disheartening to see how much harder it is to run an application as a unikernel than as a container. There are, however, plenty of opportunities for the status quo to change in the near future. This section aims to answer the question: ‘If I have an application suited to being run as a unikernel, what is the most straightforward way to do it?’
One convenient way is the NanoVMs platform, which relies on a new kernel (nanos) and an orchestrator (ops) to run and deploy applications as unikernels. Once you have gone through the getting-started guide, running a Go application, for example, is as simple as:
> go build main.go
> ops run main
The orchestrator also includes support for deploying applications as unikernels on the public cloud, with bindings for Google Cloud and Amazon Web Services.
In this blog post, we focused on projects implementing a library-operating-system approach to unikernels, where orthogonal components are selected at development/build time to create a minimal bootable image that runs an application together with all its dependencies and nothing else. Related technologies attack the same problems of sandboxing and minimising resource usage but do not commit to the same level of isolation, favouring accessibility instead. Here are some of them.
Firecracker is a Virtual Machine Monitor (VMM) open-sourced by Amazon and used to power services like AWS Lambda and AWS Fargate. It is an alternative to QEMU built for running serverless functions and containers safely and efficiently. Firecracker is written in Rust and provides a minimal device model to the guest operating system while excluding non-essential functionality.
There’s an accompanying orchestrator effort to make it as easy to use as Docker: Ignite. Ignite makes Firecracker easy to use by adopting the developer experience of containers: you pick an OCI-compliant image that you want to run as a VM, and then just execute `ignite run` instead of `docker run`.
gVisor is a user-space kernel, written in Go, that implements a substantial portion of the Linux system surface. It includes an OCI runtime called `runsc` that provides an isolation boundary between the application and the host kernel. The runtime integrates with Docker and Kubernetes, making it simple to run sandboxed containers: it limits the host kernel surface accessible to the application while still giving the application access to all the features it expects.
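For example, assuming `runsc` has already been installed and registered as a runtime with the Docker daemon (see the gVisor documentation for the exact setup), a sandboxed container can be started simply by selecting that runtime:

> docker run --rm --runtime=runsc hello-world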
Nabla containers are a new type of container which use unikernel techniques, specifically those from the Solo5 project, to avoid system calls and thereby reduce the attack surface. The project comes with a very promising OCI container runtime, which means running a workload on Kubernetes should be feasible.
Unikernels are very useful in resource-constrained environments. They offer a constructive approach to building minimal applications, supplying exactly the orthogonal pieces needed and nothing more. You might not want to run a complex web application as a unikernel, though, unless you also adopt a whole orchestration solution for unikernels (something like Albatross).
Containers don’t seem to be going away, especially as orchestration engines like Kubernetes and OpenShift continue to rise in popularity. There are also efforts to secure container workloads and minimise the operating-system interface, not unlike the minimal unikernel interface, but they come at the problem from the opposite angle: removing layers of cruft instead of building from scratch with exactly the pieces needed.