Interview with David Strauss of Pantheon
How containers will shape the Drupal ecosystem
I recently had the opportunity to interview David Strauss about how Pantheon uses containers to isolate many Drupal applications from development to production environments. His upcoming DrupalCon talk, PHP Containers at Scale: 5K Containers per Server, will give us an idea of the techniques for defining and configuring containers to get the most out of our infrastructure resources.
Having recently dived into the container realm myself, I wanted to learn from the experts about the challenges of managing containers in a production environment. Running millions of production containers related to Drupal, David is certainly an expert to ask about this subject. I look forward to learning more details at DrupalCon!
Containers have seen a rapid increase in popularity in the past few years. How do you see this specifically changing the Drupal ecosystem?
I see a few major effects from containers in the Drupal ecosystem:
1. Consistent, rich stacks: It used to be that for any specific mix of daemons, you'd have to wire them together yourself. Generic discovery never really worked for server stacks because you would deploy services first and then bolt on discovery, which then had to handle myriad inconsistencies before even touching authentication/authorization challenges. Containers and orchestration tools reverse the model. They start with a consistent discoverability, authentication, and authorization model and then require daemons to deploy within those constraints. This means it's now possible to assemble rich stacks including tools like Varnish, Solr, ElasticSearch, Beanstalk, and Redis and integrate those with the application without touching IPs, ports, and passwords. With more access to offload from PHP+MySQL to best-in-class daemons, the whole Drupal community can level up way beyond LAMP.
2. Lower barriers to HA (High Availability): With containers, it's easy to deploy application server daemon P to hardware hosts A and B alongside separate application server Q on the same hosts–all on one OS image. While you could have used VMs before, the overhead of an OS is huge compared to a moderate-sized PHP-FPM pool. This OS overhead made the VM approach half as efficient as containers in some cases, which kept the cost of HA high. Container orchestration tools also automate fail-over in ways that required manual setup/intervention or network layer three cleverness before.
3. Lower cost: Hosting and development platforms can run more efficiently, re-balance load more rapidly, and provision efficiently on bare-metal hardware. That lowers both the cost that gets passed on to website owners and developers and the environmental footprint.
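The HA layout described in the second point (application servers P and Q each running on both hosts A and B) can be illustrated with a minimal Python sketch. The function name `place_replicas` and the round-robin strategy are assumptions made for illustration; the interview does not describe how Pantheon's scheduler actually picks hosts.

```python
def place_replicas(apps, hosts, replicas=2):
    """Assign each app's replicas to distinct hosts, round-robin.

    Hypothetical helper: a real orchestrator would also weigh load,
    capacity, and failure domains.
    """
    if replicas > len(hosts):
        raise ValueError("need at least as many hosts as replicas")
    placement = {}
    for offset, app in enumerate(apps):
        # Rotate the starting host per app so load spreads evenly.
        placement[app] = [
            hosts[(offset + i) % len(hosts)] for i in range(replicas)
        ]
    return placement


# Two application servers, each replicated across both hardware hosts.
placement = place_replicas(["P", "Q"], ["A", "B"])
```

Because every app lands on two different hosts, losing either host still leaves one running copy of each application server.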
What does the orchestration look like for running so many containers at once? What are some of the pain points for managing 5000 containers on a server?
We handle orchestration through a Cassandra, Python, Chef, Consul, and systemd stack that identifies which containers should go where and provisions them. We deploy containers stopped, shut down idle containers, and (nearly) instantly reactivate them on demand using socket activation.
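To make socket activation concrete, here is a minimal Python sketch of the service side of systemd's documented protocol: systemd holds the listening socket, starts the service on the first connection, and passes the socket in as file descriptor 3 onward, announced through the `LISTEN_PID` and `LISTEN_FDS` environment variables. The helper names `inherited_fds` and `serve_once` are hypothetical, and this shows the generic mechanism rather than Pantheon's own code.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first descriptor systemd passes to a service


def inherited_fds(environ=None, pid=None):
    """Return the file descriptors handed over by socket activation."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The variables only apply to the process they were intended for.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def serve_once(fd):
    """Accept one connection on an already-listening inherited socket."""
    # socket.socket(fileno=...) detects family and type on Python 3.7+.
    with socket.socket(fileno=fd) as listener:
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(b"hello\n")


if __name__ == "__main__":
    # Outside systemd no descriptors are inherited, so this does nothing.
    for fd in inherited_fds():
        serve_once(fd)
```

The service never binds a port itself, which is what lets an idle container stay stopped until a request arrives.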
Communication between containers generally happens over mutually authenticated TLS. To each container, we issue an x.509 certificate that identifies the container's UUID as well as the website and environment (dev, test, live, etc.) it belongs to.
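As a rough illustration of mutually authenticated TLS, the Python sketch below builds a server context that rejects clients without a certificate signed by a trusted CA, then reads identity fields from the peer certificate. The mapping of UUID, website, and environment to the `commonName`, `organizationName`, and `organizationalUnitName` fields is purely an assumption; the interview does not say where Pantheon stores these values in the certificate. The function names and file-path parameters are likewise hypothetical.

```python
import ssl


def make_server_context(cert_file, key_file, ca_file):
    """Build a TLS server context that requires a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    ctx.load_verify_locations(cafile=ca_file)
    ctx.verify_mode = ssl.CERT_REQUIRED  # this makes the TLS mutual
    return ctx


def container_identity(peer_cert):
    """Extract identity fields from an ssl.SSLSocket.getpeercert() dict.

    The field-to-meaning mapping is an assumed layout for illustration.
    """
    fields = {}
    for rdn in peer_cert.get("subject", ()):
        for key, value in rdn:
            fields[key] = value
    return {
        "uuid": fields.get("commonName"),
        "site": fields.get("organizationName"),
        "environment": fields.get("organizationalUnitName"),
    }


# Example with a hand-written dict in the shape getpeercert() returns.
sample_cert = {
    "subject": (
        (("commonName", "123e4567-e89b-12d3-a456-426614174000"),),
        (("organizationName", "example-site"),),
        (("organizationalUnitName", "live"),),
    )
}
identity = container_identity(sample_cert)
```

With identity carried in the certificate, a receiving service can make authorization decisions (for example, only letting a site's own live environment reach its live database) without relying on IP addresses or shared passwords.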
What problems are you looking to tackle next when it comes to container management?
One of our current projects is "freezering," which is the ability to fully serialize a container (whether in a "hydrated" state like it exists on-disk or a "dehydrated," exported state like mysqldump produces). We're looking at container image formats like ones supported by Rocket.
How do you see open source development affecting the "container revolution"?
I don't see much happening with containers that's not open source!
What advice would you give to a company or new Drupal site looking to overhaul their development and release process?
There's no clear winner yet for orchestration, container images, or, especially, production deployment. In coming years, I expect a shakeout–or at least multiple, compatible implementations.
Docker is great for wiring together stacks for development, but it continues to be weak on the production deployment, density, and security side. CoreOS has rich services for production deployment and a good security model, but it has a higher barrier to entry as a complete distribution rather than an add-on to, say, CentOS. Project Atomic is also worth watching, as it competes with CoreOS but builds on the more familiar Red Hat-style foundation. systemd has a surprisingly rich set of container-management tools and is increasingly the standard init tool on distributions, making it the most out-of-the-box option for traditional distributions. (Disclaimer: I am a systemd developer.) libvirt-lxc has died, and OpenShift is dropping its homegrown container tools; that's some early evidence of a shakeout.
So, my advice is to experiment, maybe use containers for development environments, but wait on production deployment unless you're ready to rebuild from scratch in a year or two.
This article is part of the Speaker Interview Series for DrupalCon 2015. DrupalCon 2015 brings together thousands of people from across the globe who use, develop, design, and support the Drupal platform. It takes place in Los Angeles, California on May 11 - 15, 2015.