Create a Kubernetes cron job in OKD

Get started with OKD, a Kubernetes distribution formerly known as OpenShift Origin.

It can be daunting to get started with Kubernetes and OKD (a Kubernetes distribution formerly known as OpenShift Origin). There are a lot of concepts and components to take in and understand. This tutorial walks through creating an example Kubernetes cron job that uses a service account and a Python script to list all the pods in the current project/namespace. The job itself is relatively useless, but this tutorial introduces many parts of the Kubernetes & OKD infrastructure. Also, the Python script is a good example of using the OKD REST API for cluster-related tasks.

This tutorial covers several pieces of the Kubernetes/OKD infrastructure, including:

  • Service accounts and tokens
  • Role-based access control (RBAC)
  • Image streams
  • BuildConfigs
  • Source-to-image (S2I)
  • Jobs and cron jobs (mostly the latter)
  • Kubernetes' Downward API

The Python script and example OKD YAML files for this tutorial can be found on GitHub; to use them, just change the "value: https://okd.host:port" line in cronJob.yml to the URL of your own OKD instance.

Prerequisites

Required software

  • An OKD or Minishift cluster with the default S2I image streams installed
  • An integrated image registry

Optional software (for playing with the Python script locally)

  • Python 3
  • The openshift and kubernetes Python modules

To install the modules, run:

pip3 install --user openshift kubernetes 

Authentication

The Python script will use the service account API token to authenticate to OKD. The script expects an environment variable pointing to the OKD host where it will connect:

HOST: OKD host to connect to (e.g., https://okd.host:port)

OKD automatically creates an API token for the service account that will run the cron job pods and mounts it into those pods (see "Set how environment variables and tokens are used" below).
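
The authentication logic boils down to something like the following sketch. It is an illustration of the approach, not necessarily the exact contents of app.py in the repository: read the mounted service account token, read the HOST variable, and build an OpenShift dynamic client.

import os

import kubernetes
from openshift.dynamic import DynamicClient

# Path where OKD mounts the service account's API token in every container
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def get_dynamic_client():
    """Authenticate to the OKD REST API with the mounted token and HOST variable."""
    with open(TOKEN_PATH) as token_file:
        token = token_file.read().strip()

    configuration = kubernetes.client.Configuration()
    configuration.host = os.environ["HOST"]  # e.g., https://okd.host:port
    configuration.api_key = {"authorization": "Bearer " + token}
    # Assumes a self-signed cluster certificate; alternatively, point
    # configuration.ssl_ca_cert at the ca.crt mounted next to the token.
    configuration.verify_ssl = False

    return DynamicClient(kubernetes.client.ApiClient(configuration))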

Process

Setting up this cron job is a good learning experience: because it touches a number of Kubernetes and OKD components, you get a good feel for their various features.

The general process is:

  1. Create a new project
  2. Create a Git repository with the Python script in it
  3. Create a service account
  4. Grant RBAC permissions to the service account
  5. Set how environment variables and tokens are used
  6. Create an image stream to accept the images created by the BuildConfig
  7. Create a BuildConfig to turn the Python script into an image/image stream
  8. Build the image
  9. Create the cron job

1. Create a new project

Create a new project in OKD for this exercise:

oc new-project py-cron

Depending on how your cluster is set up, you may need to ask your cluster administrator to create a new project for you.

2. Create a Git repository with the Python script

Clone or fork this repo:

https://github.com/clcollins/openshift-cronjob-example.git

You can also reference it directly in the code examples below. This will serve as the repository from which the Python script is pulled and built into the final running image.

3. Create a service account

A service account is a non-user account that can be associated with resources, permissions, etc., within OKD. For this exercise, you will create a service account that runs the pod containing the Python script, authenticates to the OKD API via token auth, and makes REST API calls to list all the pods.

Since the Python script will query the OKD REST API to get a list of pods in the namespace, the service account will need permissions to list pods and namespaces. Technically, one of the default service accounts automatically created in a project—system:serviceaccount:default:deployer—already has these permissions. However, this exercise will create a new service account to explain service account creation and RBAC permissions.

Create a new service account by entering:

oc create serviceaccount py-cron

This creates a service account named, appropriately, py-cron. (Technically, it's system:serviceaccount:py-cron:py-cron, or the "py-cron" service account in the "py-cron" namespace.) The account automatically receives two secrets: an OKD API token and credentials for the OKD Container Registry. The API token will be used in the Python script to identify the service account to OKD for REST API calls.

The tokens associated with the service account can be viewed with the command:

oc describe serviceaccount py-cron

4. Grant RBAC permissions for the service account

OKD and Kubernetes use RBAC (role-based access control) to allow fine-grained control of who can do what in a complicated cluster. In RBAC:

  1. Permissions are based on verbs and resources (e.g., create group, delete pod, etc.)
  2. Sets of permissions are grouped into roles or ClusterRoles, the latter being, predictably, cluster-wide.
  3. Roles and ClusterRoles are associated with (or bound to) groups and service accounts (or, if you want to do it all wrong, individual users) by creating RoleBindings or ClusterRoleBindings.
  4. Groups, service accounts, and users can be bound to multiple roles.

For this exercise, create a role within the project, grant the role permissions to list pods and projects, and bind the py-cron service account to the role:

oc create role pod-lister --verb=list --resource=pods,namespaces
oc policy add-role-to-user pod-lister --role-namespace=py-cron system:serviceaccount:py-cron:py-cron

Note that --role-namespace=py-cron has to be added to prevent OKD from looking for ClusterRoles.
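
For reference, the role created by the first command above is roughly equivalent to the following YAML (a sketch; the object OKD generates may carry extra metadata):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-lister
  namespace: py-cron
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - list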

Verify the service account has been bound to the role:

oc get rolebinding | awk 'NR==1 || /^pod-lister/'
NAME         ROLE                 USERS     GROUPS   SERVICE ACCOUNTS   SUBJECTS
pod-lister   py-cron/pod-lister                      py-cron

5. Set how environment variables and tokens are used

The service account's API token and two environment variables are referenced in the Python script:

The API token automatically created for the py-cron service account is mounted by OKD into any pod the service account is running. This token is mounted to a specific path in every container in the pod:

/var/run/secrets/kubernetes.io/serviceaccount/token

The Python script reads this file and uses the token to authenticate its REST API calls to OKD.

  • HOST environment variable: The HOST environment variable is specified in the cron job definition and contains the OKD API hostname in the format: https://okd.host:port.

  • NAMESPACE environment variable: The NAMESPACE environment variable is referenced in the cron job definition and uses the Kubernetes Downward API to dynamically populate the variable with the name of the project where the cron job pod is running, as used in the listing sketch below.
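
Putting these together, the pod-listing portion of the script might look something like this sketch (again, an illustration rather than the exact app.py code), reusing the get_dynamic_client() function from the authentication sketch above:

import os

# Reuse the client from the authentication sketch
dyn_client = get_dynamic_client()

# Look up the Pod resource, then list all pods in the namespace injected by
# the Downward API; printing the result yields a ResourceInstance[PodList]
# similar to the job output shown later in this article.
v1_pods = dyn_client.resources.get(api_version="v1", kind="Pod")
print(v1_pods.get(namespace=os.environ["NAMESPACE"]))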

6. Create an image stream

An image stream is a collection of related images (in this case, the images created by the BuildConfig's builds) and an abstraction layer between images and the Kubernetes objects that use them, allowing those objects to reference the image stream rather than a specific image directly.

Before a newly built image can be pushed to an image stream, the stream must already exist. The easiest way to create a new, empty stream is with the oc command-line command:

oc create imagestream py-cron

7. Create a BuildConfig

The BuildConfig is the definition of the entire build process—the act of taking input parameters and code and turning them into an image.

The BuildConfig for this exercise uses the source-to-image (S2I) build strategy with the Red Hat-provided Python S2I builder image, adding the Python script to it; during the build, requirements.txt is parsed and the modules it lists are installed. This results in a final Python-based image containing the script and the Python modules required to run it.

The important pieces of the BuildConfig are: .spec.output, .spec.source, and .spec.strategy.

.spec.output

The output section of the BuildConfig describes what to do with the build's output. In this case, the BuildConfig outputs the resulting image as an image stream tag (e.g., py-cron:1.0) that the cron job created later in this exercise can reference.

These are probably self-explanatory.

spec:
  output:
    to:
      kind: ImageStreamTag
      name: py-cron:1.0

.spec.source

The source section of the BuildConfig describes where the content of the build comes from. In this case, it references the Git repository where the Python script and its supporting files are kept.

Most of these are self-explanatory as well.

spec:
  source:
    type: Git
    git:
      ref: master
      uri: https://github.com/clcollins/openshift-cronjob-example.git

.spec.strategy

The strategy section of the BuildConfig describes the build strategy to use, in this case, the source (i.e., S2I) strategy. The .spec.strategy.sourceStrategy.from section defines the public Python 3.6 image stream that exists in the default OpenShift namespace for anyone to use. This image stream contains S2I builder images that take Python code as input, install the dependencies listed in any requirements.txt files, then output a finished image with the code and requirements installed.

strategy:
  type: Source
  sourceStrategy:
    from:
      kind: ImageStreamTag
      name: python:3.6
      namespace: openshift

The complete BuildConfig for this example looks like the YAML below. Substitute your Git repo and create the BuildConfig with the oc command:

oc create -f <path.to.buildconfig.yaml>

The YAML:

---
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  labels:
    app: py-cron
  name: py-cron
spec:
  output:
    to:
      kind: ImageStreamTag
      name: py-cron:1.0
  runPolicy: Serial
  source:
    type: Git
    git:
      ref: master
      uri: https://github.com/clcollins/openshift-cronjob-example.git
  strategy:
    type: Source
    sourceStrategy:
      from:
        kind: ImageStreamTag
        name: python:3.6
        namespace: openshift

8. Build the image

Most of the time, it would be more efficient to add a webhook trigger to the BuildConfig to allow the image to be automatically rebuilt each time code is committed and pushed to the repo. For this exercise, however, the image build will be kicked off manually whenever the image needs to be updated.
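
If you do want automatic rebuilds, a GitHub webhook trigger can be added to the BuildConfig spec with something like the following sketch (the secret value here is just a placeholder; generate your own):

spec:
  triggers:
  - type: GitHub
    github:
      secret: my-webhook-secret

After the trigger is added, oc describe buildconfig py-cron should display the webhook URL to configure in the Git repository's settings.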

For this exercise, a new build can be triggered manually by running:

oc start-build buildconfig/py-cron

Running this command outputs the name of a build; for example:

build.build.openshift.io/py-cron-1 started

The build's progress can be followed by watching the logs:

oc logs -f build.build.openshift.io/py-cron-1

When the build completes, the image will be pushed to the image stream listed in the .spec.output section of the BuildConfig.

9. Create the cron job

The Kubernetes cron job object defines the cron schedule and behavior as well as the Kubernetes job that is created to do the actual work (in this case, running the pod with the Python script).

The important parts of the cron job definition are: .spec.concurrencyPolicy, .spec.schedule, and .spec.jobTemplate.spec.template.spec.containers.

.spec.concurrencyPolicy

The concurrencyPolicy field of the cron job spec is an optional field that specifies how to treat concurrent executions of jobs created by this cron job. In this exercise, if a previously created job is still running when it is time to create a new one, the new job replaces the running one.

Note: Other options are to allow concurrency (multiple jobs running at once) or forbid concurrency (new jobs are skipped until the running job completes).
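
In the cron job definition used for this exercise (shown in full below), that looks like:

spec:
  concurrencyPolicy: Replace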

.spec.schedule

The schedule field of the cron job spec is (unsurprisingly) a schedule in Vixie cron format. At the time(s) specified, Kubernetes creates a job, as defined in the jobTemplate spec below.

spec:
  schedule: "*/5 * * * *"

.spec.jobTemplate.spec.template.spec.containers

The cron job spec contains a jobTemplate spec, which contains a pod template spec, which in turn contains a container spec. All of these follow the standard spec for their type; i.e., the .spec.containers section is just a normal container definition you might find in any other pod definition.

The container definition for this example is a straightforward container definition that uses the environment variables discussed above.

The only important part is:

.spec.jobTemplate.spec.template.spec.serviceAccountName

This field sets the service account created earlier, py-cron, as the account running the pod's containers, overriding the default service account.

The complete OKD cron job for py-cron looks like the YAML below. Substitute the URL of the OKD cluster API and create the cron job by using the following oc command: 

oc create -f <path.to.cronjob.yaml>

The YAML:

---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  labels:
    app: py-cron
  name: py-cron
spec:
  concurrencyPolicy: Replace
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      annotations:
        alpha.image.policy.openshift.io/resolve-names: '*'
    spec:
      template:
        spec:
          containers:
          - env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: HOST
              value: https://okd.host:port
            image: py-cron/py-cron:1.0
            imagePullPolicy: Always
            name: py-cron
          serviceAccountName: py-cron
          restartPolicy: Never
  schedule: "*/5 * * * *"
  startingDeadlineSeconds: 600
  successfulJobsHistoryLimit: 3
  suspend: false

Looking around

Once the cron job has been created, it can be viewed with the oc get cronjob command. This shows a brief description of the cron job, its schedule and last run, and whether it is active or suspended:

oc get cronjob py-cron
NAME      SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
py-cron   */5 * * * *   False     0         1m              7d

As mentioned, the cron job creates a Kubernetes job to do the work each time the scheduled time passes. The oc get jobs command lists the jobs that have been created by the cron job, the desired number of completions for each (in this case, just one per job), and whether the job was successful:

oc get jobs
NAME                 DESIRED   SUCCESSFUL   AGE
py-cron-1544489700   1         1            10m
py-cron-1544489760   1         1            5m
py-cron-1544489820   1         1            30s

Each job runs in a pod, which can be seen with the oc get pods command. In this example, you can see both the job pods, named after the job that created them (e.g., job py-cron-1544489760 created pod py-cron-1544489760-xl4vt), and the build pods, i.e., the pods that built the container image as described by the BuildConfig above.

oc get pods
NAME                       READY     STATUS      RESTARTS   AGE
py-cron-1-build            0/1       Completed   0          7d
py-cron-1544489760-xl4vt   0/1       Completed   0          10m
py-cron-1544489820-zgfg8   0/1       Completed   0          5m
py-cron-1544489880-xvmsn   0/1       Completed   0          44s
py-cron-2-build            0/1       Completed   0          7d
py-cron-3-build            0/1       Completed   0          7d

Finally, because this example is just a Python script that connects to OKD's REST API to get information about pods in the project, oc logs can be used to verify the script is working by viewing the script's output written to standard output:

oc logs py-cron-1544489880-xvmsn
---> Running application from Python script (app.py) ...
ResourceInstance[PodList]:
  apiVersion: v1
  items:
  - metadata:
      annotations: {openshift.io/build.name: py-cron-1, openshift.io/scc: privileged}
      creationTimestamp: '2018-12-03T18:48:39Z'
      labels: {openshift.io/build.name: py-cron-1}
      name: py-cron-1-build
      namespace: py-cron
      ownerReferences:
      - {apiVersion: build.openshift.io/v1, controller: true, kind: Build, name: py-cron-1,
        uid: 0c9cf9a8-f72c-11e8-b217-005056a1038c}

<snip>

In summary

This tutorial showed how to create a Kubernetes cron job with OKD, a Kubernetes distribution formerly called OpenShift Origin. The BuildConfig described how to build the container image containing the Python script that calls the OKD REST API, and an image stream was created to describe the different versions of the image built by the builds. A service account was created to run the container, and a role was created to allow the service account to get pod information from OKD's REST API. A RoleBinding associated the role with the service account. Finally, a cron job was created to run a job—the container with the Python script—on a specified schedule.

Chris Collins
Chris Collins is an SRE at Red Hat and an OpenSource.com Correspondent with a passion for automation, container orchestration, and the ecosystems around them. He likes to recreate enterprise-grade technology at home for fun.


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.