Abstractions and metadata are the future of architecture in systems engineering, just as they were in software engineering. Most programming languages have long offered abstractions and metadata, but systems engineering never adopted this view: systems were always considered too unique for standard abstractions. Now that we've standardized the lower-level abstractions, we're ready to build new system-level abstractions.
There be dragons
When discussing abstractions, it is important to start with a healthy dose of skepticism. Andrew Koenig stated, "Abstraction is selective ignorance." And Joel Spolsky coined the term "Law of Leaky Abstractions" to describe how all nontrivial abstractions leak, to some degree, the details they are meant to hide.
Know that you're choosing to be ignorant of a system when you abstract it. This doesn't mean everyone is ignorant of the underlying system, but it does mean you'll have less insight into the system. For example, Amazon Web Services and Google Cloud Platform allow you to abstract away physical and even virtual servers. You won't know anything about the underlying physical hosts or related network; however, that abstraction can leak. The Amazon EC2 outage of 2011 is an example of the abstraction leaking an underlying failure. This means that you still need to be familiar with the fundamentals and how the abstraction works. Also, a team in your company should have a complete understanding of the abstractions that are operated internally; this can't be reliably outsourced.
The road to platform
Data centers today are messy affairs. An application is likely to have different operating system and middleware versions in each environment. The development environment will have the latest versions, as change here is more acceptable. Production will have the oldest versions, as change here is feared. And each application will have a different combination from any other application in the system. As a result, failures due to these inconsistencies are frequent, change is seen as the source of failure, and change is further restricted.
The modern data center is based on abstractions. The primary abstraction of the physical layer is a platform. The platform allows for interactions with compute, storage, and network using APIs and higher-level objects. The compute resources an application now sees are in the form of immutable Docker images running as containers on an underlying virtual or physical host. The application is packaged with the operating system and middleware that it has been tested against, and the same image is deployed into each environment. The application now gets its environment-specific variables automatically from the environment at deployment time rather than having them packaged into the application or provided via manual interactions.
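A minimal sketch of that last point: the same image reads its settings from the environment at start-up, so nothing environment-specific is packaged into it. The variable names (`DATABASE_URL`, `LOG_LEVEL`) and the `Settings` class are hypothetical and chosen only for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str


def load_settings(env=None):
    """Build settings from environment variables (names are hypothetical)."""
    env = os.environ if env is None else env
    # Required value: fail fast at start-up if the platform did not inject it.
    if "DATABASE_URL" not in env:
        raise RuntimeError("DATABASE_URL must be set by the deployment environment")
    return Settings(
        database_url=env["DATABASE_URL"],
        # Optional value: fall back to a default when the environment is silent.
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests; in a running container it would simply be called as `load_settings()`.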
This model allows for much greater consistency and repeatability across environments. It also increases agility, because change is no longer seen as the cause of all problems; through faster recoveries, change becomes the solution to them. Instead of optimizing for stability and getting neither rapid change nor stability, we now optimize for change and get both.
The immutable image format that facilitates this new model comes from an open source project called Docker. A Docker container is an abstraction of an underlying host, and the image it runs from consists of a layered filesystem in which each layer is immutable. A common pattern is to have an operating system layer, a middleware layer, and then an application layer. Each layer masks the layer below it. If the application layer contains a file that also exists in the middleware layer, then the application layer's file is the only one seen when the image is started. The middleware layer's file still exists, but it can't be seen or used.
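The masking behavior can be sketched with a toy model in which each layer is a plain dictionary mapping paths to contents. This is only an illustration of the lookup order, not how Docker's storage drivers are implemented, and the paths and contents are made up.

```python
def resolve(layers, path):
    """Return the contents visible at `path`, searching from the top layer down."""
    for layer in reversed(layers):
        if path in layer:
            return layer[path]
    raise FileNotFoundError(path)


# Hypothetical three-layer image: OS, then middleware, then application.
os_layer = {"/etc/os-release": "base OS"}
middleware_layer = {"/opt/app/config.xml": "middleware default"}
app_layer = {"/opt/app/config.xml": "application override"}
image = [os_layer, middleware_layer, app_layer]

# The application layer's copy wins; the middleware copy is still stored
# in its own layer but is no longer reachable through the merged view.
visible = resolve(image, "/opt/app/config.xml")
```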
This can cause leaky abstraction issues if you don't understand how Docker works. For example, if you copy a set of files in one layer and chmod them in the next layer, then every one of those files is stored in the image twice, because the second layer holds a full copy of each file it modifies. This can quickly inflate the size of the image and slow down every build, push, and pull.
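The cost of splitting a copy and a chmod across two layers can be sketched with simple arithmetic. The model below assumes, as described above, that a layer stores a whole copy of any file it touches; the file names and sizes are invented for the example.

```python
def image_size(layers):
    """Total bytes stored: each layer keeps its own copy of every file it touches."""
    return sum(size for layer in layers for size in layer.values())


# Hypothetical files and sizes in bytes.
files = {"/app/a.bin": 50_000_000, "/app/b.bin": 30_000_000}

# Layer 1 copies the files in. Layer 2 changes only their permissions,
# but the files are written again in full.
copy_layer = dict(files)
chmod_layer = dict(files)

two_step = image_size([copy_layer, chmod_layer])
# Setting permissions in the same step as the copy needs only one layer.
one_step = image_size([copy_layer])
```

In this model the two-step image stores twice as many bytes as the one-step image, which is why copying files and setting their permissions in a single build step is usually preferable.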
Docker also uses Linux kernel features such as cgroups, which Google contributed about a decade ago, and namespaces. At their most basic, cgroups limit how many resources a process can consume, and namespaces limit what a process can see.
Additional abstractions have also been added to Docker over the years, including networks, volumes, secrets, and services. Docker also added labels to allow for assigning metadata. Metadata is very important in these modern, abstracted, distributed systems. We can no longer discuss machines based solely on their name, location, or address. We now describe and reference them based on multiple attributes, such as location, type, function, and features. This results in a more flexible abstraction.
Kubernetes has taken full advantage of this new model of flexible abstractions. It was originally built on top of Docker, but it has now abstracted away the compute unit so that even virtual machines can be used as the instance container (though this is highly experimental). Kubernetes was created by Google as an open source project based on its internal cluster management system called Borg.
Kubernetes has similar abstractions to Docker, such as volumes, services, and secrets; however, Kubernetes also has an abstraction called a pod. A pod is a grouping of containers that should be colocated and share storage and network. It is the smallest deployable unit in Kubernetes, whereas a container is the smallest deployable unit in Docker.
Kubernetes also leverages a plugin system that offers an abstraction for network and storage in addition to compute. The network system then uses a Network Policy to describe connections based on metadata. Each object in Kubernetes can have labels attached. These labels are used to not only describe an object, but also to select that object and all objects with matching labels from among the unwashed masses. A Network Policy uses these labels to apply policies to objects with matching labels.
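Label-based selection can be sketched in a few lines. The example below models only simple equality matching, where an object is selected when it carries every key/value pair in the selector; the object names and labels are hypothetical, and this is an illustration of the idea rather than Kubernetes' own implementation.

```python
def matches(selector, labels):
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the names of all objects whose labels satisfy the selector."""
    return [obj["name"] for obj in objects if matches(selector, obj["labels"])]


# Hypothetical objects described by attributes rather than by name or address.
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend", "env": "dev"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend", "env": "test"}},
    {"name": "db-1", "labels": {"app": "db", "tier": "backend", "env": "dev"}},
]

dev_frontends = select(pods, {"tier": "frontend", "env": "dev"})
everything_in_dev = select(pods, {"env": "dev"})
```

A policy written against `{"tier": "frontend", "env": "dev"}` keeps applying as matching objects come and go, without anyone maintaining a list of names or addresses.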
OpenShift extends this further by adding more abstractions, including BuildConfigs and ImageStreams. A BuildConfig describes how an application should be built, including the image used to build the application, the image used to run it, where the resulting image should be stored, and when a build should be initiated. ImageStreams are an image-registry abstraction. ImageStreams can reference images stored in the OpenShift integrated registry, Docker Hub, or an internal company registry. All of these references and the related data are abstracted away from the end user, which greatly simplifies image management from an application developer's perspective.
Creating a holistic configuration
Everything described thus far has its own custom configuration documents, each requiring very particular knowledge to complete. This was a challenge for my engineering team as we sought to bring these technologies to a financial and health services enterprise with thousands of developers. We would have had to teach each developer several formats, and retrain everyone whenever a format changed. We also still had workloads outside our container orchestrator. So we created our own abstraction around higher-order objects: a system that allows a developer or administrator to describe an application from birth to death.
There are a couple of types of documents that provide this capability. We use a namespace.yaml to allocate resources to a set of objects. This document must be approved by an individual with the right level of spending authority to cover the estimated costs of the resources requested. Once those resources are approved, any application or other object inside that namespace can utilize those resources until they're fully consumed.
- name: dev
- name: test
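The "until they're fully consumed" rule can be sketched as a small budget model. Everything here is an assumption made for illustration: the `Namespace` class, the `cpu_cores` and `memory_gib` fields, and the namespace name are hypothetical and are not taken from the actual namespace.yaml format.

```python
class Namespace:
    """Toy model of an approved resource budget shared by objects in a namespace."""

    def __init__(self, name, cpu_cores, memory_gib):
        self.name = name
        self.remaining = {"cpu_cores": cpu_cores, "memory_gib": memory_gib}

    def allocate(self, cpu_cores, memory_gib):
        """Reserve resources for one object, or refuse if the budget would be exceeded."""
        request = {"cpu_cores": cpu_cores, "memory_gib": memory_gib}
        if any(request[key] > self.remaining[key] for key in request):
            return False
        for key in request:
            self.remaining[key] -= request[key]
        return True


# Hypothetical namespace approved for 4 cores and 8 GiB.
payments = Namespace("payments", cpu_cores=4, memory_gib=8)
first = payments.allocate(cpu_cores=2, memory_gib=4)   # fits
second = payments.allocate(cpu_cores=2, memory_gib=4)  # fits, budget now exhausted
third = payments.allocate(cpu_cores=1, memory_gib=1)   # refused
```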
A second level of document describes a specific object. There are multiple types, such as application, database, and document. Everything has a pipeline. These documents describe the resources required for the specific object, the relationship of the object to other named objects, the environments in which this application will run, the way the application should be built and tested, and how it should be deployed and run. This document is then converted into multiple documents related to specific technologies, like a Jenkinsfile for Jenkins and a Deployment for Kubernetes. These are then referenced to ensure that the expected state of each environment is maintained.
- name: SHARED_ENV
value: 'shared value'
- name: ANOTHER_ENV
value: 'another value'
- name: https
- name: shared-data
- name: authn
- name: dev
- name: ENVIRONMENT_SPECIFIC
value: 'dev value'
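The conversion step can be sketched as a function that renders one technology-specific document from the higher-level description. The input shape below (`name`, `image`, `env`, `environments`) is a hypothetical stand-in for the real document format, and the application name and image reference are invented; only the variable names and values are borrowed from the fragments above. The output follows the standard Kubernetes `apps/v1` Deployment structure.

```python
def to_deployment(app, environment):
    """Render a Kubernetes Deployment (as a dict) for one environment of `app`."""
    env_doc = next(e for e in app["environments"] if e["name"] == environment)
    # Shared variables first, then environment-specific ones override by name.
    merged = {v["name"]: v["value"] for v in app["env"]}
    merged.update({v["name"]: v["value"] for v in env_doc.get("env", [])})
    labels = {"app": app["name"], "environment": environment}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app["name"], "labels": labels},
        "spec": {
            "replicas": env_doc.get("replicas", 1),
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": app["name"],
                        "image": app["image"],
                        "env": [{"name": k, "value": v} for k, v in merged.items()],
                    }]
                },
            },
        },
    }


# Hypothetical application document, already parsed into Python data.
app = {
    "name": "example-app",
    "image": "registry.example.com/example-app:1.0",
    "env": [
        {"name": "SHARED_ENV", "value": "shared value"},
        {"name": "ANOTHER_ENV", "value": "another value"},
    ],
    "environments": [
        {"name": "dev", "env": [{"name": "ENVIRONMENT_SPECIFIC", "value": "dev value"}]},
    ],
}

deployment = to_deployment(app, "dev")
```

A second renderer with the same input could emit a Jenkinsfile or any other target format, which is what lets the tooling change without the developer-facing document changing.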
This abstraction allows our developers to stay focused on creating business value, while a central team can utilize a single document interface to move an application from GitLab to production. Our tools can now be changed as needed without any changes from developers. It is the central team's responsibility to maintain the contract through these documents so that the developer experience doesn't change as tools change. Because this system has worked so well for us, we hope to open source it in the near future.
For a deeper dive into this topic, attend Daniel's talk, Architecting the Future: Abstractions and Metadata, at All Things Open, Oct. 23-24, in Raleigh, N.C.