Steve Gordon co-wrote this article.
Containers and Kubernetes have been widely promoted as "disruptive" technologies that will replace everything that preceded them, most notably virtual machine (VM) management platforms such as vSphere and OpenStack. Instead, as with most platform innovations, Kubernetes is more often used to add a layer to (or complement) VMs. In this article, and in a presentation at SCALE16x, we'll be exploring two relatively new projects that aim to assist users in combining Kubernetes with virtualization: KubeVirt and Kata Containers.
Most organizations still have large existing investments in applications that run on virtualized hosts, infrastructure that runs them, and tools to manage them. We can envision this being true for a long time to come, just as remnants of previous generations of technology remain in place now. Additionally, VM technology still offers a level of isolation that container-enablement features, like user namespaces, have yet to meet. However, those same organizations want the ease-of-use, scalability, and developer appeal of Kubernetes, as well as a way to gradually transition from virtualized workloads to containerized ones.
Kubernetes' recently added Service Catalog feature addresses integration to an extent. This feature allows Kubernetes to create "endpoints" so containerized applications can route requests to applications running elsewhere, such as on an OpenStack cluster. However, this approach doesn't provide a way to unify management of virtualized and containerized applications, nor does it provide any way for Kubernetes applications to be a part of the VM platform's environment and namespace.
Developers have launched several projects in the last year to meet these integration requirements. While there are many differences in technical details, at a high level these projects can be divided into two contrasting usage targets:
- Running traditional VM workloads alongside application containers as part of a complex application
- Running application container workloads with hardware-assisted VM-level isolation for security and/or resource management
These use cases imply different types of source images and user expectations in terms of workflow, startup speed, and memory overhead.
In the traditional VM workload case, the user expects to be able to run an existing VM image (or at least trivially convert one) that contains a full operating system and the application, and that comes with the boot time and memory overhead this implies. The guest may even run a different operating system from the host, and it may have specific requirements regarding virtual devices and memory addresses that can't be realized using containers.
In the application container workload case, users expect to be able to run an Open Container Initiative-compliant container image as a VM. They also expect a fast launch like any other application container and a low memory overhead that more directly maps to what the application will use. Their reason for using a VM is that they want hardware-enforced isolation of workloads, for either security or performance.
Let's compare two such projects: KubeVirt for the traditional VM use case and Kata Containers for the isolation use case.
KubeVirt for traditional VMs
The KubeVirt project was launched by three Red Hat engineers in late 2016. The project's goal was to help enterprises move from a VM-based infrastructure to a Kubernetes-and-container-based stack, one application at a time. This meant providing a mechanism to treat applications built by existing VM development workflows like native Kubernetes applications, including management and routing. At the same time, many of these applications require a significant amount of VM-native configuration to function.
Kubernetes 1.7, released in June 2017, included Custom Resource Definitions (CRDs), Kubernetes' most powerful extension API. CRDs let developers create an entirely new type of object for the Kubernetes system, define the characteristics and functionality of that object, and then load it into a running cluster. For the KubeVirt contributors, who wanted KubeVirt's VMs to behave like VMs, CRDs were the perfect interface, and the project was refactored around them.
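To make the idea of a CRD concrete, here is a minimal sketch of one, written as a Python dictionary and printed as JSON (kubectl accepts JSON manifests as well as YAML). The API group vm.example.com and the two spec fields are hypothetical and chosen only for illustration; this is not KubeVirt's actual definition. It also uses the apiextensions.k8s.io/v1 schema from later Kubernetes releases rather than the beta API that shipped with 1.7.

```python
import json

GROUP = "vm.example.com"    # hypothetical API group, not KubeVirt's
PLURAL = "virtualmachines"  # hypothetical plural resource name

crd = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    # Kubernetes requires the CRD name to be "<plural>.<group>".
    "metadata": {"name": f"{PLURAL}.{GROUP}"},
    "spec": {
        "group": GROUP,
        "scope": "Namespaced",
        "names": {
            "plural": PLURAL,
            "singular": "virtualmachine",
            "kind": "VirtualMachine",
        },
        "versions": [
            {
                "name": "v1alpha1",
                "served": True,
                "storage": True,
                # The schema declares which fields the new object type accepts.
                "schema": {
                    "openAPIV3Schema": {
                        "type": "object",
                        "properties": {
                            "spec": {
                                "type": "object",
                                "properties": {
                                    "cores": {"type": "integer"},   # illustrative field
                                    "memory": {"type": "string"},   # illustrative field
                                },
                            }
                        },
                    }
                },
            }
        ],
    },
}

print(json.dumps(crd, indent=2))
```

Once a definition like this is loaded, the cluster accepts objects of the new kind just as it accepts pods or services. A controller, which is not shown here, is what gives those objects their behavior.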
Users who have existing KVM-based VM workflows need to add one automated step, which makes some changes to the image for the KubeVirt environment. Once that's done, the image can be deployed to the Kubernetes cluster using a YAML manifest, just like other Kubernetes objects. KubeVirt "wraps" each deployed VM in a pod, the basic unit of deployment in Kubernetes, which would normally contain one or more containers instead of a VM. This allows for almost-seamless integration with other services deployed as normal container-based pods. Assigned virtual disks become Kubernetes Persistent Volumes and can take advantage of any storage already available to the cluster.
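As a rough illustration of what such a manifest contains, the sketch below builds one as a Python dictionary and prints it as JSON, which kubectl accepts in place of YAML. The field names follow the kubevirt.io/v1 API of more recent KubeVirt releases, so treat them as an assumption; the alpha API available when this article was written differed in its details. The image reference is a hypothetical placeholder.

```python
import json

# Hypothetical image reference; substitute a disk image prepared for KubeVirt.
DISK_IMAGE = "registry.example.com/disks/testvm:latest"

vm = {
    "apiVersion": "kubevirt.io/v1",  # assumption: later, stable API version
    "kind": "VirtualMachine",
    "metadata": {"name": "testvm"},
    "spec": {
        "running": False,  # define the VM without starting it yet
        "template": {
            "metadata": {"labels": {"kubevirt.io/domain": "testvm"}},
            "spec": {
                "domain": {
                    "devices": {
                        # Each disk refers to a volume of the same name below.
                        "disks": [{"name": "rootdisk", "disk": {"bus": "virtio"}}]
                    },
                    "resources": {"requests": {"memory": "64M"}},
                },
                "volumes": [
                    {"name": "rootdisk", "containerDisk": {"image": DISK_IMAGE}}
                ],
            },
        },
    },
}

print(json.dumps(vm, indent=2))
```

The output can be saved to a file and submitted with kubectl in the same way as a manifest for any other Kubernetes object.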
The CRD mechanism also means that the KubeVirt project can define extra "knobs" for tuning behaviors that aren't available in regular containers, such as virtual devices and CPU characteristics. While not all this functionality is complete, existing features include:
- Creation of VMs using different operating systems, including Windows
- Configuration of CPU, RAM, and a virtual disk for each VM
- Connecting the VM directly to a serial or network console
- Using cloud-init for VM boot-time configuration
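The sketch below shows how a few of these knobs might be expressed in the spec.template.spec section of a VirtualMachine manifest. The helper function vm_template_spec is our own invention for illustration, the image reference is a hypothetical placeholder, and the field names (cpu.cores, resources.requests.memory, cloudInitNoCloud) are taken from more recent KubeVirt releases, so they may not match the API as it stood when this article was written.

```python
import json


def vm_template_spec(cores: int, memory: str, user_data: str) -> dict:
    """Build an illustrative VM template spec with CPU, RAM, disk, and cloud-init settings."""
    return {
        "domain": {
            "cpu": {"cores": cores},
            "resources": {"requests": {"memory": memory}},
            "devices": {
                "disks": [
                    {"name": "rootdisk", "disk": {"bus": "virtio"}},
                    {"name": "cloudinitdisk", "disk": {"bus": "virtio"}},
                ]
            },
        },
        "volumes": [
            # Hypothetical image reference for the VM's root disk.
            {
                "name": "rootdisk",
                "containerDisk": {"image": "registry.example.com/disks/testvm:latest"},
            },
            # cloud-init user data is handed to the guest on its own small disk.
            {"name": "cloudinitdisk", "cloudInitNoCloud": {"userData": user_data}},
        ],
    }


user_data = "#cloud-config\nhostname: testvm\n"
print(json.dumps(vm_template_spec(2, "1Gi", user_data), indent=2))
```

None of these settings has an equivalent in a regular container spec, which is why KubeVirt defines its own object type rather than reusing the pod definition.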
More importantly, KubeVirt VMs can be examined and controlled using Kubernetes' standard command-line interface, kubectl, as if they were regular pods. Because administrators also sometimes need to connect to the "console" for the VM, KubeVirt comes with an additional CLI tool, virtctl, which supports this:
virtctl console --kubeconfig=$KUBECONFIG testvm
The KubeVirt project is in rapid development and looking for more contributors. You can connect with the team via its GitHub repo and the forums mentioned there, or in the #virtualization channel in Kubernetes' Slack.
Kata Containers: VMs for isolation
Like VMs, containers are used to build clouds that host applications. Container clouds are popular because containers can be deployed and started rapidly and automatically, and because they use less CPU and memory. These properties enable development and management models that are unattainable in a VM environment. However, many cloud owners also want secure isolation of workloads, which is more difficult to achieve in a container stack. The Kata Containers project aims to provide the security isolation of VMs with the speed of containers.
The Linux kernel on the host machine isolates workloads running in various application containers from each other using kernel namespaces, as well as various kernel process confinement technologies like SELinux and seccomp. While namespace isolation in the kernel continues to improve, it still does not (and arguably never will) offer the same level of isolation as hardware-assisted virtualization running on modern CPUs. Because all containers on a host share that host's kernel, hostile code or a buggy application in one container can still reach any unpatched kernel vulnerability, if nothing else.
A KubeVirt-like approach provides VM-level security isolation, but it also burdens administrators with additional management, slower deployments, and slower launch times for the VM pods. For developers who want rapid, container-based development workflows and don't care about compatibility with legacy platforms, that's a poor tradeoff. Kata Containers takes a different approach to gain container-like speed, using a stripped-down VM platform and a different Kubernetes API.
Intel launched a container project called Clear Containers in 2015. Part of Intel's Clear Linux initiative, Clear Containers implemented an approach to secure containers that took advantage of Intel CPU virtualization hardware. Concurrently, a Beijing-based startup called HyperHQ launched Hyper, which took a similar approach to solving the container isolation problem. At KubeCon, in December 2017, the two projects announced they were merging as the Kata Containers project, hosted by the OpenStack Foundation.
Since the Kata Containers project is still in early development, we'll explain the workflows and architecture of Clear Containers 3.0, with the belief that the eventual release version of Kata Containers will be very similar.
Rather than providing top-level API objects via a CRD like KubeVirt, Clear Containers 3.0 provides an Open Container Initiative (OCI)-compatible runtime implementation called cc-runtime. It then plugs into Kubernetes using a different extension API, the Container Runtime Interface (CRI), which gives Kubernetes its ability to run on different container platforms, including Docker, containerd, rkt, and CRI-O. In effect, Kubernetes treats Clear Containers VMs as just a different "flavor" of container. In practice, this also entails some extra configuration of each Kubernetes node.
The cc-runtime "containers" run a "mini O/S" image, which, although technically a full VM, contains only a cut-down Linux kernel and little other operating system machinery. Combined with optimizations added to Intel's virtualization hardware, this allows the container VMs to launch much faster than traditional VMs, while also using fewer system resources. While they are slower than regular Linux containers, they are fast enough to integrate into container-based development workflows.
In a Kubernetes-Clear Containers cluster, pods can be specified as either trusted or untrusted. Trusted pods use a regular runc-based container runtime. For untrusted pods, the runtime creates a lightweight VM and then spawns all the containers in the pod inside that VM. Since Kubernetes clusters can be configured as trusted or untrusted by default, this means that administrators can add the additional isolation of hardware-assisted virtualization without having to change the workload definitions at all.
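The selection logic described above can be modeled in a few lines. This is a simplified sketch of the decision only, not code from CRI-O or Clear Containers. The function name select_runtime is ours, and the annotation key is the one we believe CRI-O used for this purpose at the time, so treat it as an assumption.

```python
from typing import Mapping

# Assumed pod annotation key; verify against your runtime's documentation.
TRUST_ANNOTATION = "io.kubernetes.cri-o.TrustedSandbox"


def select_runtime(annotations: Mapping[str, str], default_trusted: bool = True) -> str:
    """Return the runtime a pod would be handed to in this simplified model.

    A pod that sets the annotation overrides the cluster default;
    a pod that does not set it inherits the default.
    """
    value = annotations.get(TRUST_ANNOTATION)
    if value is None:
        trusted = default_trusted
    else:
        trusted = value.strip().lower() == "true"
    # Trusted pods share the host kernel; untrusted pods get a lightweight VM.
    return "runc" if trusted else "cc-runtime"


# The same unannotated pod lands on a different runtime depending on the default.
print(select_runtime({}, default_trusted=True))
print(select_runtime({}, default_trusted=False))
# An explicit annotation overrides the default.
print(select_runtime({TRUST_ANNOTATION: "false"}, default_trusted=True))
```

The second call shows the point made above: flipping the default moves every unannotated pod into a VM without any change to its definition.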
In the coming months, Intel, HyperHQ, and a growing number of supporting companies will be working to merge the Clear Containers and Hyper implementations into the new Kata Containers project.
VM-container fusion and the future
While KubeVirt and Kata Containers share ideas, some of their supporting companies, and a desire to integrate VMs and containers, it seems unlikely that the two fundamentally different approaches can ever be reconciled. KubeVirt aims to provide as much VM functionality as possible, while Kata Containers tries to provide a container-like experience for VMs. As with other irreconcilable architecture tradeoffs, we can expect to see some users employing both approaches concurrently for different workloads.
There are, of course, other tools and implementations that aim to fuse VMs and containers, such as Virtlet and RancherVM. Generally speaking, these projects take one of the two approaches above, although they may differ from KubeVirt and Kata Containers in specific details. Regardless of the tools employed, both VM admins who want to move to Kubernetes and container developers who want better isolation can look forward to an easier-to-manage, more unified future.
You can learn more about Kubernetes and VMs at You Got Your VM In My Container, a presentation by Josh Berkus and Jason Brooks at the Southern California Linux Expo, including demos and sample code. In addition, both KubeVirt and Kata Containers are looking for contributors. Find them on their GitHub pages or on the #virtualization channel on Kubernetes Slack.
To attend SCALE16x and get 50% off your ticket, register using promo code OSDC.