Lately I've been getting my hands dirty deploying applications on Mesos clusters, using Marathon to run Docker containers. I appreciate how it lets you deploy a wide variety of applications with very little configuration, and seeing your applications scale up and down in a matter of seconds is really neat, but I have had a hard time making an application available to end users. Hosting a web site on a Mesos cluster, which seems like a basic use case to me, turns out to be rather involved, especially when you take into account the infrastructure specifics of your cloud provider. Sure, you can get a container running in no time, but routing a request to it and getting a response back is no trivial matter. In this post I will show how to deploy a simple container with a web server on a Mesos cluster running on Google Cloud, and how to make it reachable from the outside.
If you want to follow along, you will need the following:
Once you have the basics set up, visit https://google.mesosphere.com and follow the three simple steps to create a cluster. Just choose a development cluster, enter your public SSH key and your Google Project ID. Review the settings and click Launch, and in a few minutes you'll have a shiny new Mesos cluster, with both the Marathon and Chronos frameworks available for scheduling your tasks. We'll be using Marathon to deploy our web application in a container.
Next we'll need to create a configuration for our application. For the purpose of this example I'm using a simple Nginx application. There are certainly more interesting applications to deploy, but I just want to demonstrate the process of running a web server, so a static web page will suffice. You can specify all kinds of things in the configuration; consult the Marathon documentation for an overview of the possibilities. Below are the contents of my nginx.json configuration file:
{
  "id": "nginx",
  "container": {
    "docker": {
      "image": "library/nginx:1.7.9",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 80, "servicePort": 80 }
      ]
    }
  },
  "cpus": 0.2,
  "mem": 256.0,
  "instances": 1
}
The container will expose port 80 (containerPort), and we'll use the same port for our service (servicePort). The servicePort can be used to link different applications running on the cluster to each other. The Mesosphere setup on Google Cloud comes with HAProxy and the haproxy-marathon-bridge script installed on every node (master and slaves). The bridge script queries the Marathon API every minute, generates an HAProxy configuration with pools of servers listening on the servicePorts specified in the application configurations, and reloads HAProxy if anything changed. The benefit is that any application running on the cluster can reach any other application through the servicePort on its own host. This host is passed to the container as an environment variable on startup, so there is no need for a separate service discovery mechanism. You will still need to tell your application about the servicePorts of other applications, for instance through environment variables that you add to the configuration file. The nice thing is that the same HAProxy can also let outside users connect to your application, wherever on the cluster it is running, as we'll see later.
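To make the role of the bridge more concrete, here is a simplified sketch of the kind of translation it performs. This is not the actual haproxy-marathon-bridge script: the input format (one line per application with its name, servicePort and task addresses), the function name and the backend address 10.0.0.5:31005 are assumptions made for illustration.

```shell
#!/bin/sh
# Simplified illustration only; NOT the real haproxy-marathon-bridge script.
# Assumed input on stdin, one line per application:
#   <app name> <servicePort> <host:port> [<host:port> ...]

render_haproxy_apps() {
  while read -r app port backends; do
    # Skip blank lines and entries without a numeric service port.
    case "$port" in
      ''|*[!0-9]*) continue ;;
    esac
    # One HAProxy "listen" section per application, bound to the servicePort.
    printf 'listen %s-%s\n' "$app" "$port"
    printf '  bind 0.0.0.0:%s\n' "$port"
    printf '  mode tcp\n'
    # One "server" line per running task of the application.
    n=0
    for backend in $backends; do
      n=$((n + 1))
      printf '  server %s-%s %s check\n' "$app" "$n" "$backend"
    done
  done
}

# Example: a single nginx task on a made-up node address and host port.
printf 'nginx 80 10.0.0.5:31005\n' | render_haproxy_apps
```

Because every node runs HAProxy with the same generated sections, a request to port 80 on any node ends up at a running nginx task, wherever that task was scheduled.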
Before we can post the configuration file to the Marathon server, we'll have to set up a VPN connection. On the overview page of your cluster, at https://google.mesosphere.com/clusters/<your cluster name> you can download the configuration for the connection. Save the file on your computer, open a terminal and execute:
sudo openvpn --config /path/to/your.ovpn
If all is well you will now be able to visit the Marathon and Mesos consoles listed on the same page. These are only exposed on ports 8080 and 5050 respectively, on the internal network interface of the Mesos master node. You will need the IP address of this interface to talk to Marathon on port 8080 in the next step.
Now, open another terminal and post the JSON config to Marathon:
curl -XPOST -d @nginx.json -H "Content-Type: application/json" http://<marathon ip>:8080/v2/apps
The response will look like the one below. I piped it through jq for better readability:
{
  "id": "/nginx",
  "cmd": null,
  "args": null,
  "user": null,
  "env": {},
  "instances": 1,
  "cpus": 0.2,
  "mem": 256,
  "disk": 0,
  "executor": "",
  "constraints": [],
  "uris": [],
  "storeUrls": [],
  "ports": [
    0
  ],
  "requirePorts": false,
  "backoffSeconds": 1,
  "backoffFactor": 1.15,
  "maxLaunchDelaySeconds": 3600,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "library/nginx:1.7.9",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 0,
          "servicePort": 80,
          "protocol": "tcp"
        }
      ],
      "privileged": false,
      "parameters": []
    }
  },
  "healthChecks": [],
  "dependencies": [],
  "upgradeStrategy": {
    "minimumHealthCapacity": 1,
    "maximumOverCapacity": 1
  },
  "labels": {},
  "version": "2015-02-17T19:32:03.648Z"
}
You can check in either the Marathon or Mesos console to see your application being deployed and running. Congratulations! You've just launched your first Docker container on a Mesos cluster! You can check that it's really working by visiting the private IP address of any of the Mesos nodes (master or slaves) in your browser. You might need to wait a minute for the cron job to run the bridge script and update the HAProxy configuration, but then you should see the default Nginx page. This only works over the VPN though, so it's not what we want.
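If you prefer the terminal over the consoles, the deployment can also be polled through the Marathon API. The sketch below assumes that GET /v2/apps/<app id> returns a tasksRunning counter, as Marathon versions from this period do; the function names are mine, and the sed expression is a crude stand-in for a real JSON parser such as jq.

```shell
#!/bin/sh
# Extract the first "tasksRunning" number from JSON on stdin.
# Crude text matching, used here only to avoid extra dependencies.
tasks_running() {
  sed -n 's/.*"tasksRunning"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Poll Marathon until the app reports at least one running task.
#   $1 = Marathon base URL, $2 = app id, $3 = max attempts (default 30)
wait_for_app() {
  attempts=${3:-30}
  while [ "$attempts" -gt 0 ]; do
    running=$(curl -s "$1/v2/apps/$2" | tasks_running)
    if [ "${running:-0}" -ge 1 ]; then
      echo "$2 has $running running task(s)"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 2
  done
  echo "$2 is not running yet" >&2
  return 1
}

# Usage, with the VPN connection up:
#   wait_for_app "http://<marathon ip>:8080" nginx
```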
So now that we have our application running, how do we make it accessible to the outside world? In the case of Google's Cloud we'll need to take care of a few things. We'll need to create a forwarding rule, open a port in the firewall and configure iptables on the Mesos master and slaves. We probably could get away with just opening up port 80 on the master node, but this would leave us with a single point of failure. In case the master node is down or unreachable, our web application would not be available. What we would like is to have a fixed external address to connect to, so we can for instance point to it with our DNS. Connections to this address on port 80 should be forwarded to a target pool of servers, in which we put all our nodes. Then we open up port 80 in the firewall so we can access the application in our browser. So let's get to it!
First, let's create an external address to connect to. In your terminal, issue the following:
gcloud compute addresses create myipaddress --region <cluster region>
If you do not specify a region you will be asked for one. Be sure to pick the same region your cluster is running in. You can find this on the cluster overview page, but in my case it lists the zone (us-central1-a) instead of the region (us-central1). You will get a response like:
Created [https://www.googleapis.com/compute/v1/projects/<myproject>/regions/<cluster region>/addresses/myipaddress].
NAME         REGION    ADDRESS       STATUS
myipaddress  <region>  <ip address>  RESERVED
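Since the cluster overview page shows the zone rather than the region, it can be handy to derive one from the other. A small sketch, assuming zone names follow the usual pattern of a region name plus a one-letter suffix (the function name is hypothetical):

```shell
#!/bin/sh
# Derive the region from a zone name by dropping the last "-<suffix>" part.
zone_to_region() {
  printf '%s\n' "${1%-*}"
}

zone_to_region us-central1-a   # prints us-central1
```

The result can be passed straight to the --region flag of the gcloud commands in this post.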
Take note of the IP address because we'll need it shortly. Next we need to create a target pool to forward the traffic to:
gcloud compute target-pools create mytargetpool --region <cluster region>
Add instances to the pool, using the names of the nodes listed on the cluster overview page:
gcloud compute target-pools add-instances mytargetpool \
  --instances <master node> <slave node 1> <slave node 2> <slave node 3> --zone <cluster zone>
And then create a forwarding rule to forward connections on port 80 on the external address to the target pool:
gcloud compute forwarding-rules create myforwardingrule --address <ip address> --port-range 80 \
  --project <myproject> --region <cluster region> --target-pool mytargetpool
When this is done, we need to open up the firewall. We'll tag the nodes and create a firewall rule to allow connections on port 80 to the servers with the corresponding target tags.
gcloud compute instances add-tags <master node> --tags http-server --zone <cluster zone>
gcloud compute instances add-tags <slave node 1> --tags http-server --zone <cluster zone>
gcloud compute instances add-tags <slave node 2> --tags http-server --zone <cluster zone>
gcloud compute instances add-tags <slave node 3> --tags http-server --zone <cluster zone>
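With more than a handful of nodes, typing one add-tags command per node gets tedious. One possible approach is to generate the commands in a loop, review them, and only then pipe them to a shell. The node names below are placeholders and the function name is made up for this sketch.

```shell
#!/bin/sh
# Print one "gcloud compute instances add-tags" command per node.
#   $1 = tag, $2 = zone, remaining arguments = instance names
tag_commands() {
  tag=$1
  zone=$2
  shift 2
  for node in "$@"; do
    printf 'gcloud compute instances add-tags %s --tags %s --zone %s\n' \
      "$node" "$tag" "$zone"
  done
}

# Review the generated commands first (placeholder node names):
tag_commands http-server us-central1-a mesos-master mesos-slave-1 mesos-slave-2

# When they look right, run them by piping to a shell:
#   tag_commands http-server us-central1-a mesos-master mesos-slave-1 | sh
```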
In order to create the firewall rule you need to specify the network that the cluster is running on. First let's list the networks in our project:
gcloud compute networks list
That should list all the networks in your project. If you are still unsure which one to pick, check the description of any of the nodes:
gcloud compute instances describe <master node> --zone <cluster zone> | grep networks
The output will be something like:
network: https://www.googleapis.com/compute/v1/projects/myproject/global/networks/<cluster network>
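The firewall rule needs only the short network name, which is the last path segment of that URL. A tiny sketch for pulling it out with plain shell parameter expansion (the function name and the example network name "default" are assumptions):

```shell
#!/bin/sh
# Keep everything after the last "/" of a network URL.
network_name() {
  printf '%s\n' "${1##*/}"
}

network_name "https://www.googleapis.com/compute/v1/projects/myproject/global/networks/default"
# prints: default
```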
Now we can create the firewall rule:
gcloud compute firewall-rules create httprule --allow tcp:80 \
  --network <cluster network> --target-tags http-server
That should be it, but there's another catch which took me a while to figure out. It turns out that iptables is running on the nodes and blocks the connections. To fix this, we need to SSH into every node and open up the port in iptables. Again, you can find the SSH user and the external IP addresses of the nodes on the cluster overview page.
ssh jclouds@<master ip-address>
Once you're in, add two rules to iptables:
sudo iptables -A INPUT -p tcp -m state --state NEW,ESTABLISHED -m tcp --dport 80 -j ACCEPT
sudo iptables -A OUTPUT -p tcp -m tcp --sport 80 -m state --state ESTABLISHED -j ACCEPT
If you would like to save these rules in case you need to reboot the server, do the following:
sudo sh -c 'iptables-save > /etc/iptables/rules.v4'
Don't forget to repeat these steps for the other nodes in the cluster. You probably could (and should) lock things down further to prevent users from accessing your Mesos nodes directly, by only allowing traffic from the external IP address. I haven't looked into that though, so I don't know which source and destination you would need to specify.
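Repeating the iptables steps by hand on every node is error-prone, so here is a sketch of how they could be scripted. It assumes the jclouds user can run sudo without a password prompt over a non-interactive SSH session, which I have not verified for every image; the function names and the example address are made up.

```shell
#!/bin/sh
# Print the two iptables rules from the steps above for a given port.
open_port_commands() {
  printf 'sudo iptables -A INPUT -p tcp -m state --state NEW,ESTABLISHED -m tcp --dport %s -j ACCEPT\n' "$1"
  printf 'sudo iptables -A OUTPUT -p tcp -m tcp --sport %s -m state --state ESTABLISHED -j ACCEPT\n' "$1"
}

# Feed those rules to a remote shell on every node.
#   $1 = port, remaining arguments = external IP addresses of the nodes
open_port_on_nodes() {
  port=$1
  shift
  for ip in "$@"; do
    echo "Opening port $port on $ip"
    open_port_commands "$port" | ssh "jclouds@$ip" sh
  done
}

# Show what would be executed on each node:
open_port_commands 80

# Usage (203.0.113.10 is a documentation address, not a real node):
#   open_port_on_nodes 80 203.0.113.10
```

Note that this only adds the rules to the running configuration; persisting them still requires the iptables-save step on each node.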
When you're done, it's finally time to visit the site in your browser. Go to the IP address you reserved in the 'gcloud compute addresses create' step above. You should see the default Nginx page.
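The same check can be done from a terminal, which is useful when you want to script it. A minimal sketch using curl's -w option to print just the HTTP status code (the function name is hypothetical):

```shell
#!/bin/sh
# Print only the HTTP status code returned for the given URL.
site_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Usage, with the address reserved earlier:
#   site_status "http://<ip address>/"
# A status of 200 would indicate that Nginx answered through the forwarding rule.
```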
So that's all there is to it. Of course there's loads more involved in running a proper web application like scaling, persistence, container linking and such. I just wanted to demonstrate how to run a web application on Mesos and actually show a web page to your audience.