(Our CTO John Kinsella originally wrote this article for Linux.com)

When I first started considering container use in production environments, a question came to mind: When a container may exist across a cluster of servers, how do I get others (people, or applications) to connect to it reliably, no matter where it is in the cluster?

Of course, this problem existed to a degree in the olden days of virtual (or not) machines as well. Back in the Old Days of three-tier webapp stacks, it was handled “gracefully” as follows:

  • Load balancers had hard-coded IP addresses for the web servers they balanced traffic across
  • Web servers had hard-coded IPs for the application servers that handled application logic
  • Application servers had bespoke, hand-crafted definitions of the databases they queried for data to return to the web servers

This was “simple” as long as web, application, or database servers weren’t replaced. If and when they were, the “smarter” of us ensured the new system used the IP address of its predecessor. (See? Simple! Who needs DevOps?) (Bonus points to those who used internal DNS instead of IPs.)

Solutions arose in time. Zookeeper comes to mind, but it was far from alone. Service discovery is getting more attention now as complexity has increased: where before there might have been 10 VMs, there may now be 200-300 containers, and their lifecycles are significantly shorter than a VM’s.

Following Docker’s “batteries included, but can be replaced” philosophy, Docker Swarm mode comes with a built-in DNS server. This provides users with simple service discovery; if at some point their needs surpass the design goals of the DNS server, they can use third-party discovery services, instead (covered in our next blog post!).

Getting started

There are plenty of resources available on the Internet discussing how to install Docker Swarm mode, so that won’t be repeated here. For this post, I’ll be using a Vagrant configuration that I forked on GitHub and extended with some port forwarding. If you have Vagrant and VirtualBox installed, bringing up a Docker Swarm mode cluster is as easy as:
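Assuming the forked Vagrant configuration lives at a placeholder URL (substitute the real fork), the sequence looks something like:

```shell
# Clone the Vagrant configuration (placeholder URL -- substitute the actual fork)
git clone https://github.com/<your-fork>/swarm-mode-vagrant.git
cd swarm-mode-vagrant

# Bring up the VMs and provision the swarm
vagrant up
```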

After the last command, take a break to stretch your legs – usually it takes 5-10 minutes for Vagrant to download the Ubuntu VM image, bring up 3 VMs, update packages on each, install Docker and join the VMs to a swarm mode cluster. Once completed, you should be able to ssh into the master node and list members of the swarm with:
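Assuming the manager VM is named `node-1` in the Vagrantfile (adjust to match yours), the check looks like:

```shell
# SSH into the manager node (the VM name depends on the Vagrantfile)
vagrant ssh node-1

# From inside the manager, list the members of the swarm
docker node ls
```

`docker node ls` only works on a manager node; run from a worker it will tell you to retarget a manager.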

(Your IDs will differ; Docker generates a unique ID for each node.)

Launch a WordPress cluster

Next, let’s launch a WordPress cluster of two WordPress containers backed by a MariaDB database. To make this easy, I’ve created another GitHub project containing a docker-compose file to build the cluster. Let’s clone the project and bring up the containers:
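With a placeholder URL standing in for the actual demo repository, and assuming the stack is deployed under the name `wordpress`, the steps are:

```shell
# Clone the demo project (placeholder URL -- substitute the actual repo)
git clone https://github.com/<your-fork>/wordpress-swarm-demo.git
cd wordpress-swarm-demo

# Deploy the stack described in docker-stack.yml to the swarm
docker stack deploy --compose-file docker-stack.yml wordpress
```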

In the background, Docker is scheduling those containers to run across the swarm, downloading images, and spinning up the containers. Depending on your computer and network speeds, after about a minute you should be able to see the services running:
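The running services, along with their replica counts, can be listed with:

```shell
# Show each service in the swarm and how many of its replicas are running
docker service ls
```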

At this point, you should be able to load http://localhost:8080 in a browser and see the initial WordPress configuration screen.

Take a look at the docker-stack.yml file. You’ll see environment variables passed to the WordPress containers instructing them to connect to a MariaDB database with a hostname of dbcluster – the name listed for the database service. It’s just a string passed into the container; there’s no defined link between the two services. In older versions of this demo, we would have had to create a “link” between the wordpress and dbcluster services in the docker-stack.yml file in order for the wordpress containers to be able to recognize and use the dbcluster hostname. This would have looked like:
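In the legacy Compose format, such a link would have been declared roughly like this (service and image names here mirror the demo, but this snippet is a sketch, not the demo’s actual file):

```yaml
services:
  wordpress:
    image: wordpress
    links:
      - dbcluster
  dbcluster:
    image: mariadb
```

Note that `docker stack deploy` ignores the `links` key entirely – swarm-mode DNS makes it unnecessary.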

Instead, what happens here is that after Docker creates the dbcluster container, it automatically publishes an A record in its DNS service so that other containers can find it with a DNS name lookup. If you look at the beginning of one of the WordPress containers’ logs, you may see that at first it’s unable to resolve the dbcluster host; after a few tries, the lookup succeeds. The connection is then refused while MariaDB is still starting up. WordPress keeps attempting to establish a database connection, and once the database is up, it connects and we’re up and running.
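You can watch this retry loop yourself (the service name below assumes the stack was deployed as `wordpress`; adjust to match your stack):

```shell
# Follow the logs of the WordPress service across all of its replicas
docker service logs -f wordpress_wordpress
```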

Scaling the Database

Next, let’s try scaling up the database and see what happens. The database container image I picked for this demo has been built with MariaDB’s Galera clustering enabled, configured to discover cluster members via multicast. While DNS-based service discovery is built into Docker Swarm mode, the more complex process of multi-master replication is still left for the application to handle, which is why the Galera clustering functionality is required. Let’s scale the database cluster up to three nodes:
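Scaling is a single command (the service name again assumes a stack deployed as `wordpress`):

```shell
# Scale the database service to three replicas
docker service scale wordpress_dbcluster=3
```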

Docker spins up two more containers and adds them to a load-balanced pool behind the dbcluster virtual IP address. The containers start, discover each other via multicast, and sync up; once their logs show that the new members have synced with the group (after about 30 seconds), we have a three-node database cluster.
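To confirm the three replicas are running, and see which swarm nodes they landed on:

```shell
# List the tasks (containers) backing the service and their placement
docker service ps wordpress_dbcluster
```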

Try loading the WordPress site again in your browser – it should still work! At this point, when the WordPress containers attempt to connect to dbcluster, the request is load-balanced by Docker across the three dbcluster containers. The IP address for dbcluster that is published in Docker’s DNS is a “virtual IP,” and behind the scenes Docker load balances traffic to the cluster members using IPVS. If multi-master replication were not synchronized, this would cause significant confusion, if it worked at all.
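You can see the virtual IP yourself with `docker service inspect`; the `.Endpoint.VirtualIPs` field holds one VIP per network the service is attached to (service name as before assumes a stack named `wordpress`):

```shell
# Show the virtual IP(s) Docker assigned to the service
docker service inspect --format '{{json .Endpoint.VirtualIPs}}' wordpress_dbcluster
```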

Finally, let’s scale the database cluster back to a single node. While I’d want to be very, very certain of what I was doing (and the state of my backups) before trying this in production, for this demo we can be carefree and try:
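Scaling down uses the same command as scaling up (same assumed service name as above):

```shell
# Scale the database service back down to a single replica
docker service scale wordpress_dbcluster=1
```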

With that, two containers are gracefully shut down, the MariaDB cluster returns to a size of one, and WordPress should still be running happily.

Docker Swarm mode service discovery works quite well, and it helps us loosely define relationships between the parts of an application. There are limitations to what it can do, though – we’ll cover those in future posts.