Just say no to root (in containers)

Even smart admins can make bad decisions.

I get asked all the time about the different security measures used to control what container processes on a system can do. Most of these I covered in previous articles on Opensource.com:

Almost all of the above articles are about controlling what a privileged process on a system can do in a container. For example: How do I run root in a container and not allow it to break out?

User namespace is all about running privileged processes in containers, such that if they break out, they are no longer privileged outside the container. For example, in the container my UID is 0 (zero), but outside of the container my UID is 5000. Due to problems with the lack of file system support, user namespace has not been the panacea it was cracked up to be. Until now.
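
A quick way to see this mapping in action is to look at /proc/self/uid_map from inside a rootless container. This is only a sketch: the host UID of 5000 matches the example above, and the subordinate UID range starting at 100000 is simply whatever /etc/subuid assigns on your machine.

$ id -u
5000
$ podman run --rm fedora cat /proc/self/uid_map
         0       5000          1
         1     100000      65536

Here UID 0 inside the container is really the unprivileged UID 5000 on the host, so "root" in the container carries no special power outside of it.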

OpenShift is Red Hat's container platform, built on Kubernetes, Red Hat Enterprise Linux, and OCI containers, and it has a great security feature: By default, no containers are allowed to run as root. An admin can override this; otherwise, all user containers run without ever being root. This is particularly important in multi-tenant OpenShift Kubernetes clusters, where a single cluster may be serving multiple applications and multiple development teams. It is not always practical or even advisable for administrators to run separate clusters for each. Sadly, one of the biggest complaints about OpenShift is that users cannot easily run all of the community container images available at docker.io. This is because the vast majority of container images in the world today require root.
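
For the rare case where an image genuinely needs root, an OpenShift admin can grant an exception to a single project instead of loosening the whole cluster. A hedged sketch (the project name and the default service account are placeholders):

oc adm policy add-scc-to-user anyuid -z default -n myproject

Everything else on the cluster keeps running under the restricted, non-root defaults.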

Why do all these images require root?

If you actually examine the reasons for being root on a system, they are quite limited.

Modify the host system:

  • One major reason for being root on the system is to change the default settings on the system, like modifying the kernel's configuration.
  • In Fedora, CentOS, and Red Hat Enterprise Linux, we have the concept of system containers, which are privileged containers that can be installed on a system using the atomic command (see the example after this list). They can run fully privileged and are allowed to modify the system as well as the kernel. In the case of system containers, we are using the container image as a content-delivery system, not really looking for container separation. System containers are more for the core operating system's host services, as opposed to the user application services that most containers run.
  • In application containers, we almost never want the processes inside the container to modify the kernel. This is definitely not required by default.
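
As an illustration of that workflow, a system container is installed and managed roughly like this (etcd is only an example; image names and flags vary by distribution and atomic version, so treat this as a sketch):

sudo atomic install --system etcd
sudo systemctl start etcd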

Unix/Linux tradition:

  • Operating system software vendors and developers have known for a long time that running processes as root is dangerous, so the kernel added lots of Linux capabilities to allow a process to start as root and then drop privileges as quickly as possible. Most of the UID/GID capabilities allow a process like a web service to start as root, then become non-root. This is done to bind to ports below 1024 (more on this later).
  • Container runtimes can start applications as non-root to begin with. Truth be known, so can systemd, but most software that has been built over the past 20 years assumes it is starting as root and dropping privileges.
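
For example, a runtime can start the application as a non-root user from the outset and drop every Linux capability it does not need (the image name and UID here are placeholders):

podman run -d --user 1001 --cap-drop=ALL mywebapp

If the process really does need one specific capability, it can be added back with --cap-add rather than running the whole thing as root.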

Bind to ports < 1024

  • Way back in the 1960s and 1970s when there were few computers, the inability of unprivileged users to bind to network ports < 1024 was considered a security feature. Because only an admin could do this, you could trust the applications listening on these ports. Ports > 1024 could be bound by any user on the system so they could not be trusted. The security benefits of this are largely gone, but we still live with this restriction.
  • The biggest problem with this restriction is with web services, where people love to have their web servers listening on port 80. This means the main reason Apache or Nginx start out running as root is so that they can bind to port 80 and then become non-root.
  • Container runtimes, using port forwarding, can solve this problem. You can set up a container to listen on any network port, and then have the container runtime map that port to port 80 on the host.

In this command, the podman runtime runs an apache_unpriv container on your machine listening on port 80 on the host, while the Apache process inside the container is never root: it starts as the apache user and listens on port 8080.

podman run -d -p 80:8080 -u apache apache_unpriv

Alternatively:

docker run -d -p 80:8080 -u apache apache_unpriv

Installing software into a container image

  • When Docker introduced building containers with docker build, the content in the containers was just standard packaged software for distributions. The software usually came via rpm packages or Debian packages. Well, distributions package software to be installed by root. A package expects to be able to do things like manipulate the /etc/passwd file by adding users, and to put down content on the file system with different UID/GIDs. A lot of the software also expects to be started via the init system (systemd) and start as root and then drop privileges after it starts.
  • Sadly, five years into the container revolution, this is still the status quo. A few years ago, I attempted to get the httpd package to know when it is being installed by a non-root user and to have a different configuration. But I dropped the ball on this. We need to have packagers and package management systems start to understand the difference, and then we could make nice containers that run without requiring root.
  • One thing we could do now to fix this issue is to separate the build systems from the installing systems. One of my issues with #nobigfatdaemons is that the Docker daemon used the same privileges for building a container image as it did for running a container.
  • If we change the system or use different tools, say Buildah, for building a container with looser constraints and CRI-O/Kubernetes/OpenShift for running the containers in production, then we can build with elevated privileges, but run the containers with much tighter constraints, or hopefully as a non-root user, as sketched below.
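
A rough sketch of that split, with Buildah doing the (possibly privileged) image build and podman standing in for the production runtime (the image name, UID, and port are illustrative):

sudo buildah bud -t myapp .
podman run -d --user 1001 --cap-drop=ALL -p 8080:8080 myapp

The build step can do all the root-only package installation it wants, while the running container is locked down and never root.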

Bottom line

Almost all software you are running in your containers does not require root. Your web applications, databases, load balancers, number crunchers, etc., do not ever need to run as root. When we get people to start building container images that do not require root at all, and others to base their images off of non-privileged container images, we will see a giant leap in container security.
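
As a minimal sketch of what such an image can look like, here is a hypothetical Containerfile for an unprivileged Apache image. The package, port, and sed edit assume a stock Fedora httpd layout, and a real image would likely also need log and PID file locations made writable by the apache user:

FROM registry.fedoraproject.org/fedora
# Install httpd and switch it to an unprivileged port
RUN dnf install -y httpd && dnf clean all && \
    sed -i 's/^Listen 80$/Listen 8080/' /etc/httpd/conf/httpd.conf
EXPOSE 8080
# Run as the apache user instead of root
USER apache
CMD ["/usr/sbin/httpd", "-DFOREGROUND"]

Because of the USER directive, nothing in this container ever needs to start as root, which is a step toward images that run cleanly under non-root defaults like OpenShift's.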

There is still a lot of educating that needs to be done around running root inside of containers. Without education, smart administrators can make bad decisions.

Daniel Walsh has worked in the computer security field for almost 30 years. Dan joined Red Hat in August 2001.

Comments


This article is useful for learning about root security in containers. It helped me understand how containers behave with respect to root and the kernel.

I think this post sounds logical but reaches an incorrect conclusion. It’s true that the first thing people think of when discussing the “need” for root access is port numbers under 1024 and modifying kernel settings. But the post uncritically accepts the notion that “users should have the minimum access rights needed to do their job.” This idea, or best practice, is widely accepted today, but its implications are rarely discussed. The spread of this thinking can and does significantly increase the risk of production failures and security incidents.

Here’s why: when a production failure is thoroughly investigated, it’s common that there are many contributory root causes - often four, five, or more. My experience is that human ignorance will often be part of at least two or three of those factors. When “system administrator” became an established career path, it became less common for programmers to have root privileges - all for very logical, good intentions. The unintended consequence of this is that developers, as a group, have less understanding of how their software actually behaves in production. Linux has evolved and become a much more observable operating system, with sophisticated tracing tools - many of which require root. At the same time, today’s multisocket, multicore hardware is more powerful, and more complex. Yet as hardware becomes more powerful, performance and security seem to remain stationary. For application software to make full use of that hardware, developers need to understand how their code interacts with the machine - something that the popularity of VMs, containers, and Java (3 huge positive developments) all obscure. Unfortunately, following security best practices can lead us to less secure environments where no one person understands the entire system.

Just to join the dots further... Here’s what I mean by a developer understanding how a system uses resources:
“Application X has over 100 threads. Three are hot threads that spin on a core - the market data event handler and the two worker threads. Most of the remaining threads are thread-per-connection threads that are cold most of the time. Then there are four warm threads - the logger, the persister... We want the market data event handler to always run on socket one because the market data NIC is on the second PCI-X slot and that thread can stay NUMA-local. This app is one where latency is more important than throughput, so we don’t use the default NIC interrupt coalescing settings,...”

The best way to validate that these preconditions are true is to use ethtool and perf-test (both require root). So the question shouldn’t be “do we need root access?” but rather “does it make sense to have root access?” I have seen far more damage caused by developers and SAs not understanding how their systems behave than by the occasional human errors made at a root shell. I think that root access should be audited - one of the best learning experiences about a host can be simply to execute:
sudo -s
history | more

In reply to Peter Booth


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.