This article is based on a talk I gave at DockerCon this year. It will discuss Docker container security, where we are currently, and where we are headed.
This is part of a series on Docker security; read part two.
Containers do not contain
I hear and read about a lot of people assuming that Docker containers actually sandbox applications—meaning they can run random applications on their system as root with Docker. They believe Docker containers will actually protect their host system.
- I have heard people say Docker containers are as secure as running processes in separate VMs/KVM.
- I know people are downloading random Docker images and then launching them on their host.
- I have even seen PaaS servers (not OpenShift, yet) allowing users to upload their own images to run on a multi-tenant system.
- I have a co-worker who said: "Docker is about running random code downloaded from the Internet and running it as root."
"Will you walk into my parlour?" said the Spider to the Fly.
Stop assuming that Docker and the Linux kernel protect you from malware.
Do you care?
If you are not running Docker on a multi-tenant system, and you are using good security practices for the services running within a container, you probably do not need to worry. Just assume that privileged processes running within the container are the same as privileged processes running outside of the container.
Some people make the mistake of thinking of containers as a better and faster way of running virtual machines. From a security point of view, containers are much weaker, which I will cover later in this article.
If you believe as I do, Docker containers should be treated as "container services", meaning a container running Apache should be treated the same way you would treat the Apache service running on your system. This means you would do the following:
- Drop privileges as quickly as possible
- Run your services as non-root whenever possible
- Treat root within a container as if it is root outside of the container
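As a rough illustration of the first two points, here is a minimal Python sketch of a service that starts as root and then drops to an unprivileged account. The function name drop_privileges is my own for illustration, and the uid and gid you pass in depend on the accounts that exist inside your image.

```python
import os


def drop_privileges(uid: int, gid: int) -> bool:
    """Permanently switch from root to an unprivileged uid/gid.

    Returns False, and changes nothing, when the process is not root.
    """
    if os.geteuid() != 0:
        return False  # already unprivileged; nothing to drop

    os.setgroups([])  # clear supplementary groups while still root
    os.setgid(gid)    # change the group first; this still needs root
    os.setuid(uid)    # change the user last; root cannot be regained
    return True
```

A service would do its privileged setup first, such as binding to port 80, and then call drop_privileges before it handles any untrusted input. The order matters: once setuid has run, the process no longer has the right to change its groups.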
Currently we are telling people in Common Criteria to treat privileged processes within a container with the same criteria as privileged processes running outside the container.
Don't run random Docker images on your system. In a lot of ways I see the Docker container revolution as similar to the Linux revolution around 1999. At that time, when an administrator heard about a new cool Linux service, they would:
- Search the Internet for a package at places like rpmfind.net or just random websites
- Download the program onto their system
- Install it via RPM or make install
- Run it with privilege
What could go wrong?
Two weeks later the administrator hears about a zlib vulnerability and has to figure out whether their software is vulnerable, while hoping and praying that it is not!
This is where Red Hat distributions and a few other trusted parties have stepped in to save the day. Red Hat Enterprise Linux gives administrators:
- A trusted repository they can download software from
- Security Updates to fix vulnerabilities
- A security response team to find and manage vulnerabilities
- A team of engineers to manage/maintain packages and work on security enhancements
- Common Criteria Certification to check the security of the operating system
Only run containers from trusted parties. I believe you should continue to get your code/packages from the same people you have gotten them from in the past. If the code does not come from an internal source or a trusted third party, do not rely on container technology to protect your host.
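One way to make this rule harder to break by accident is to check where an image comes from before launching it. The sketch below is only an illustration: the registry name registry.example.com and the helper names are hypothetical, and it assumes the usual Docker convention that the first component of an image name is a registry host only if it contains a dot or a colon, or is "localhost". Knowing the registry tells you who published the image, not that its contents are safe.

```python
# Hypothetical allow-list; replace with the registries you actually trust.
TRUSTED_REGISTRIES = {"registry.example.com"}


def registry_of(image: str) -> str:
    """Return the registry host an image reference points at."""
    first, sep, _rest = image.partition("/")
    # Without a slash, or with a plain first component such as a user
    # name, the image comes from the default public registry.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def is_trusted(image: str, trusted=TRUSTED_REGISTRIES) -> bool:
    """True only when the image comes from an allow-listed registry."""
    return registry_of(image) in trusted


if __name__ == "__main__":
    for image in ("registry.example.com/web/httpd:2.4", "someuser/httpd"):
        print(image, "->", "allowed" if is_trusted(image) else "refused")
```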
So what is the problem? Why don't containers contain?
The biggest problem is that not everything in Linux is namespaced. Currently, Docker uses five namespaces to alter a process's view of the system: Process, Network, Mount, Hostname, Shared Memory.
While these give the user some level of security, it is by no means comprehensive the way KVM is. In a KVM environment, processes in a virtual machine do not talk to the host kernel directly. They do not have any access to kernel file systems like /sys, /sys/fs, and /proc/*.
Device nodes are used to talk to the VM's kernel, not the host's. Therefore, in order to have a privilege escalation out of a VM, the process has to subvert the VM's kernel, find a vulnerability in the hypervisor, break through SELinux controls (sVirt), which are very tight on a VM, and finally attack the host's kernel.
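To see which namespaces a process actually lives in, you can read the links under /proc/<pid>/ns. The following Python sketch maps the five namespaces named above to their kernel names (pid, net, mnt, uts, ipc) and prints their identifiers; the function name namespace_ids is my own. Running it on the host and again inside a container should show different identifiers for each namespace the container has been given.

```python
import os

# Kernel names for the five namespaces listed above.
DOCKER_NAMESPACES = {
    "pid": "Process",
    "net": "Network",
    "mnt": "Mount",
    "uts": "Hostname",
    "ipc": "Shared Memory",
}


def namespace_ids(pid="self"):
    """Map namespace name to its identifier link, e.g. 'pid:[...]'.

    Returns an empty dict if /proc/<pid>/ns cannot be read.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids
    for name in names:
        try:
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # skip entries we are not allowed to read
    return ids


if __name__ == "__main__":
    ids = namespace_ids()
    for kernel_name, label in DOCKER_NAMESPACES.items():
        print(f"{label:14} {ids.get(kernel_name, 'not available')}")
```

Anything that does not appear in this listing is shared with the host, which is the point of the rest of this section.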
When you run in a container you have already gotten to the point where you are talking to the host kernel.
Major kernel subsystems are not namespaced, such as:
- file systems under /sys
- /proc/sys, /proc/sysrq-trigger, /proc/irq, /proc/bus
Devices are not namespaced:
- /dev/sd* file system devices
- Kernel Modules
If you can communicate with or attack one of these as a privileged process, you can own the system.
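A quick way to get a feel for this exposure is to ask, from inside a container, which of these kernel interfaces the current process could write to. The Python sketch below checks the paths listed above with os.access; the function name writable_kernel_paths is my own. Treat the result as a hint rather than proof: a writable path is not automatically exploitable, and a path reported as not writable may still be readable.

```python
import os

# Host kernel interfaces named above that are not namespaced.
HOST_KERNEL_PATHS = [
    "/sys",
    "/proc/sys",
    "/proc/sysrq-trigger",
    "/proc/irq",
    "/proc/bus",
]


def writable_kernel_paths(paths=HOST_KERNEL_PATHS):
    """Map each path to True if it exists and this process may write to it."""
    return {p: os.path.exists(p) and os.access(p, os.W_OK) for p in paths}


if __name__ == "__main__":
    for path, writable in writable_kernel_paths().items():
        print(f"{path:22} {'WRITABLE' if writable else 'not writable'}")
```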