At Container Solutions we constantly push the boundaries of the tools we work with. While exploring programmable infrastructure we combine the available tools in new ways. Sometimes it works and sometimes it doesn't, but nevertheless we like to contribute things we learned back to the community.
One example of our efforts concerns Terraform. We have done extensive research into Terraform and have been using it with different platforms, some more exotic than others. When we applied Terraform to provisioning bare-metal servers, we found that the environment in which we were provisioning them lacked a few networking services. When a machine starts up in an unconfigured state, it will try to boot using BOOTP, so you need services like DHCP, a PXE server and possibly a DNS server. Cobbler is a system that bundles these services, so we decided to use that. We quickly found that the proper way for Terraform to manage systems in Cobbler would be through a Provider. No such provider existed yet, so we decided to write our own.
Structure of a Provider
My mate Carlos has already started a series detailing how to write a provider, so I won't repeat that here. Instead I'll explain a bit about the specifics of dealing with Cobbler. From Carlos' post it's clear that we need to define a Provider with three items: a Schema, which is a collection of parameters for configuring the Provider; a ResourcesMap, which lists the resources that will be configurable using this Provider; and a ConfigureFunc, which details how to set up the connection. For each of the resources we will need a Schema as well, plus methods describing how to Create, Read, Update and Delete these resources. I could paste some snippets of the code here but you might as well head over to GitHub to check out the full source code.
Provider
The provider is fairly straightforward. Its Schema contains three strings: a url, a username and a password. For now we have three resources defined in the ResourcesMap: a system, a kickstart file and a snippet.
The ConfigureFunc for the provider returns an HTTP client that talks to the Cobbler server. We abstracted the code that actually interfaces with Cobbler into a separate client library, to make the Provider code cleaner. We just pass the values for the url, username and password, which we get from the .tf configuration file provided by the user, as arguments to a new Client object.
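To make the shape of this concrete, here is a minimal sketch rather than the provider's actual code. The Schema, Resource and Provider types below are stand-ins that only mimic the shape of Terraform's schema package, and the Client type, NewClient function and resource names are assumptions made up for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-in types that only mimic the shape of Terraform's schema package.
// They are illustrative and are NOT the real SDK types.
type Schema struct {
	Type     string
	Required bool
}

type Resource struct {
	Schema map[string]*Schema
}

type Provider struct {
	Schema        map[string]*Schema
	ResourcesMap  map[string]*Resource
	ConfigureFunc func(cfg map[string]string) (interface{}, error)
}

// Client is a hypothetical stand-in for the separate Cobbler client library.
type Client struct {
	URL, Username, Password string
}

// NewClient only stores the connection settings in this sketch.
func NewClient(url, username, password string) *Client {
	return &Client{URL: url, Username: username, Password: password}
}

// NewProvider wires the three settings and the three resources together.
// The resource names are assumptions made for this example.
func NewProvider() *Provider {
	str := func() *Schema { return &Schema{Type: "string", Required: true} }
	return &Provider{
		Schema: map[string]*Schema{
			"url":      str(),
			"username": str(),
			"password": str(),
		},
		ResourcesMap: map[string]*Resource{
			"cobbler_system":    {},
			"cobbler_kickstart": {},
			"cobbler_snippet":   {},
		},
		// Pass the user-supplied values straight into a new Client.
		ConfigureFunc: func(cfg map[string]string) (interface{}, error) {
			for _, key := range []string{"url", "username", "password"} {
				if cfg[key] == "" {
					return nil, errors.New("missing provider setting: " + key)
				}
			}
			return NewClient(cfg["url"], cfg["username"], cfg["password"]), nil
		},
	}
}

func main() {
	p := NewProvider()
	meta, err := p.ConfigureFunc(map[string]string{
		"url":      "http://cobbler.example.com/cobbler_api",
		"username": "cobbler",
		"password": "secret",
	})
	if err != nil {
		fmt.Println("configure failed:", err)
		return
	}
	fmt.Println(meta.(*Client).URL, len(p.ResourcesMap))
}
```

Whatever the ConfigureFunc returns is what each resource method later receives as its connection handle, which is why returning the client here keeps the resource code free of connection details.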
System Resource
Next we define the System Resource. This has a slightly more elaborate Schema, which contains a map of network interfaces. However, most fields in the Schema are just strings, so it's not really special either. The interesting stuff here is in the Create method. Cobbler expects you to perform a series of requests to create a resource (a System in this case) and modify its properties, and then, once you're done, to send a sync request that commits your changes. This involves updating the configuration files and restarting the various services under Cobbler's control, e.g. the DHCP server. The good thing is that this encourages you to set up a bunch of changes and commit them all at once. In fact, if you call sync after each change, Cobbler will try to restart the services repeatedly, and it breaks very quickly.
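The "stage everything, then sync once" pattern can be sketched as follows. The caller interface and the recorder are hypothetical helpers that only log calls instead of talking to a server. The method names (new_system, modify_system, save_system, sync) are how we recall Cobbler's XML-RPC API, so treat them as assumptions and check them against the Cobbler documentation.

```go
package main

import "fmt"

// caller is a hypothetical abstraction over the Cobbler XML-RPC endpoint.
type caller interface {
	Call(method string, args ...interface{}) (interface{}, error)
}

// recorder is a fake caller that only records which methods were invoked.
type recorder struct{ calls []string }

func (r *recorder) Call(method string, args ...interface{}) (interface{}, error) {
	r.calls = append(r.calls, method)
	return "handle-1", nil // fake handle; only new_system's result is used
}

// createSystems stages every system first and syncs exactly once at the end.
func createSystems(c caller, token string, systems []map[string]string) error {
	for _, props := range systems {
		handle, err := c.Call("new_system", token)
		if err != nil {
			return err
		}
		for key, value := range props {
			if _, err := c.Call("modify_system", handle, key, value, token); err != nil {
				return err
			}
		}
		if _, err := c.Call("save_system", handle, token); err != nil {
			return err
		}
	}
	// One sync for the whole batch, so the services restart only once.
	_, err := c.Call("sync", token)
	return err
}

// stagedCalls runs the sketch against the recorder and returns the call log.
func stagedCalls() []string {
	r := &recorder{}
	systems := []map[string]string{
		{"name": "node1"},
		{"name": "node2"},
	}
	if err := createSystems(r, "token", systems); err != nil {
		return nil
	}
	return r.calls
}

func main() {
	fmt.Println(stagedCalls())
}
```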
The unfortunate thing is that Terraform doesn't accommodate this behaviour, i.e. there is no way to run a post-execution hook or anything like it. So we decided to call the sync in a goroutine. The first call to Create starts the goroutine and waits for a signal from it over a channel. We set a timeout when creating the resource, and perform the sync only after that timeout has passed. Each call to the Create method resets the timeout to one second in the future, so the timeout expires one second after the last Create has been executed. The channel makes sure the calling thread waits for the goroutine; otherwise the Terraform run would end before the sync had actually been called. The timeout of one second is a bit arbitrary, but it seems to do the job.
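The idea of a deadline that every Create pushes further out can be sketched roughly like this. It is a simplified illustration under our own assumptions, not the code from the provider: the debouncer type and its method names are invented for the example, and the delay is shortened from one second to 50 milliseconds so the demo finishes quickly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// debouncer runs action once, after no touch has happened for delay.
type debouncer struct {
	mu       sync.Mutex
	deadline time.Time
	started  bool
	done     chan struct{}
	delay    time.Duration
	action   func()
}

func newDebouncer(delay time.Duration, action func()) *debouncer {
	return &debouncer{delay: delay, action: action, done: make(chan struct{})}
}

// touch pushes the deadline out and starts the waiting goroutine on first use.
func (d *debouncer) touch() {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.deadline = time.Now().Add(d.delay)
	if !d.started {
		d.started = true
		go d.run()
	}
}

// run sleeps until the deadline stops moving, then performs the action.
// A touch that arrives after the deadline has passed is not handled here.
func (d *debouncer) run() {
	for {
		d.mu.Lock()
		remaining := time.Until(d.deadline)
		d.mu.Unlock()
		if remaining <= 0 {
			break
		}
		time.Sleep(remaining)
	}
	d.action()
	close(d.done) // signal the waiting caller that the action has run
}

// wait blocks until the action has run; touch must have been called first.
func (d *debouncer) wait() { <-d.done }

// simulateCreates performs n fake Create calls (n > 0) and returns how
// often the sync action ran.
func simulateCreates(n int) int {
	syncs := 0
	d := newDebouncer(50*time.Millisecond, func() { syncs++ })
	for i := 0; i < n; i++ {
		d.touch()                         // each Create resets the deadline
		time.Sleep(10 * time.Millisecond) // pretend the Create takes a while
	}
	d.wait() // without this, the run could end before the sync happens
	return syncs
}

func main() {
	fmt.Println("sync calls:", simulateCreates(3))
}
```

Polling a mutex-protected deadline avoids the subtleties of resetting a running timer, at the cost of one extra wake-up per Create.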
To do
We haven't implemented the Read and Update methods yet; they just return nil for now, because we thought they were not important for our immediate use case. While diving further into Terraform provider internals we came to understand the need for a Read method, which is to sync the current state of the infrastructure (Cobbler) with Terraform's internal state as reflected in the state file. So this is definitely on our list of things to implement.
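As an illustration of what such a Read has to do, here is a small sketch under stated assumptions. The state and inventory types are invented stand-ins for Terraform's resource data and for a lookup against Cobbler. In a real provider, clearing the ID would be done with the resource data's SetId("") call, which tells Terraform the resource no longer exists.

```go
package main

import "fmt"

// state is a stand-in for the per-resource data Terraform keeps in its state file.
type state struct {
	ID       string
	Hostname string
}

// inventory is a hypothetical view of what Cobbler knows: system name -> hostname.
type inventory map[string]string

// readSystem refreshes the stored state from what Cobbler reports.
// Clearing the ID marks the resource as gone, so it can be recreated.
func readSystem(s *state, cobbler inventory) {
	hostname, ok := cobbler[s.ID]
	if !ok {
		s.ID = ""
		return
	}
	s.Hostname = hostname
}

func main() {
	cobbler := inventory{"node1": "node1.example.com"}

	known := &state{ID: "node1", Hostname: "stale.example.com"}
	readSystem(known, cobbler)
	fmt.Printf("%+v\n", *known)

	gone := &state{ID: "node2", Hostname: "node2.example.com"}
	readSystem(gone, cobbler)
	fmt.Printf("%+v\n", *gone)
}
```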
Another thing we need to change is the way we handle logging in to Cobbler. We currently log in at the beginning of the Create and Delete methods, which causes a new login for every resource created or deleted. We should move the login to the ConfigureFunc, so the login happens only once and the token that is returned is reused throughout the Terraform run.
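A rough sketch of that change, assuming a hypothetical login function and client type (neither is the real client library): the provider's configure step logs in once, and the resource methods reuse the token stored on the client.

```go
package main

import "fmt"

// client holds the token obtained during the provider's configure step.
type client struct{ token string }

// configure plays the role of the configure step: it logs in exactly once.
// The login argument is a hypothetical call to Cobbler's login endpoint.
func configure(login func() (string, error)) (*client, error) {
	token, err := login()
	if err != nil {
		return nil, err
	}
	return &client{token: token}, nil
}

// createSystem and deleteSystem reuse the stored token instead of logging in.
func (c *client) createSystem(name string) string {
	return "create " + name + " using " + c.token
}

func (c *client) deleteSystem(name string) string {
	return "delete " + name + " using " + c.token
}

// countLogins creates and deletes the given number of systems and reports
// how many logins were needed in total.
func countLogins(resources int) int {
	logins := 0
	c, err := configure(func() (string, error) {
		logins++
		return "token-123", nil
	})
	if err != nil {
		return -1
	}
	for i := 0; i < resources; i++ {
		name := fmt.Sprintf("node%d", i)
		c.createSystem(name)
		c.deleteSystem(name)
	}
	return logins
}

func main() {
	fmt.Println("logins:", countLogins(5)) // prints "logins: 1"
}
```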
Also, we currently don't call sync after deleting resources, which we really should, because the DHCP server also needs to be notified of resources that have been removed.
Other Resources
The code for the kickstart and snippet resources is quite simple really. They just take a path to a text file on the local system and post its contents to Cobbler under the specified name.
Wrapping up
As you can see there's still a lot missing, but basically it works. We can create a system in Cobbler using Terraform and have our bare-metal server boot over the network, with its boot image served by Cobbler. The exciting thing is that our efforts did not go unnoticed by the HashiCorp folks, who kindly proposed to merge the code into the Terraform tree. We're currently in the process of getting our PR approved. As you can probably guess, we're very pleased to get a chance to give something back to the community, which after all is one of Arnold Schwarzenegger's 6 rules of success.