Have you wanted to cause chaos to test your systems but prefer to use visual tools rather than the terminal? Well, this article is for you, my friend. In the first article in this series, I explained what chaos engineering is; in the second article, I demonstrated how to get your system's steady state so that you can compare it against a chaos state; and in the third, I showed how to use Litmus to test arbitrary failures and experiments in your Kubernetes cluster.
The fourth article introduces Chaos Mesh, an open source chaos orchestrator with a web user interface (UI) that anyone can use. It allows you to create experiments and display statistics in a web UI for presentations or visual storytelling. The Cloud Native Computing Foundation hosts the Chaos Mesh project, which means it is a good choice for Kubernetes. So let's get started! In this walkthrough, I'll use Pop!_OS 20.04, Helm 3, Minikube 1.14.2, and Kubernetes 1.19.
Configure Minikube
If you haven't already, install Minikube in whatever way that makes sense for your environment. If you have enough resources, I recommend giving your virtual machine a bit more than the default memory and CPU power:
$ minikube config set memory 8192
❗ These changes will take effect upon a minikube delete and then a minikube start
$ minikube config set cpus 6
❗ These changes will take effect upon a minikube delete and then a minikube start
Then start and check the status of your system:
$ minikube start
? minikube v1.14.2 on Debian bullseye/sid
? minikube 1.19.0 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.19.0
? To disable this notice, run: 'minikube config set WantUpdateNotification false'
✨ Using the docker driver based on user configuration
? Starting control plane node minikube in cluster minikube
? Creating docker container (CPUs=6, Memory=8192MB) ...
? Preparing Kubernetes v1.19.0 on Docker 19.03.8 ...
? Verifying Kubernetes components...
? Enabled addons: storage-provisioner, default-storageclass
? Done! kubectl is now configured to use "minikube" by default
$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
Install Chaos Mesh
Start installing Chaos Mesh by adding the repository to Helm:
$ helm repo add chaos-mesh https://charts.chaos-mesh.org
"chaos-mesh" has been added to your repositories
Then search for your Helm chart:
$ helm search repo chaos-mesh
NAME CHART VERSION APP VERSION DESCRIPTION
chaos-mesh/chaos-mesh v0.5.0 v1.2.0 Chaos Mesh® is a cloud-native Chaos Engineering...
Once you find your chart, you can begin the installation steps, starting with creating a chaos-testing
namespace:
$ kubectl create ns chaos-testing
namespace/chaos-testing created
Next, install your Chaos Mesh chart in this namespace and name it chaos-mesh
:
$ helm install chaos-mesh chaos-mesh/chaos-mesh --namespace=chaos-testing
NAME: chaos-mesh
LAST DEPLOYED: Mon May 10 10:08:52 2021
NAMESPACE: chaos-testing
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Make sure chaos-mesh components are running
kubectl get pods --namespace chaos-testing -l app.kubernetes.io/instance=chaos-mesh
As the output instructs, check that the Chaos Mesh components are running:
$ kubectl get pods --namespace chaos-testing -l app.kubernetes.io/instance=chaos-mesh
NAME READY STATUS RESTARTS AGE
chaos-controller-manager-bfdcb99fd-brkv7 1/1 Running 0 85s
chaos-daemon-4mjq2 1/1 Running 0 85s
chaos-dashboard-865b778d79-729xw 1/1 Running 0 85s
Now that everything is running correctly, you can set up the services to see the Chaos Mesh dashboard and make sure the chaos-dashboard
service is available:
$ kubectl get svc -n chaos-testing
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
chaos-daemon ClusterIP None <none> 31767/TCP,31766/TCP 3m42s
chaos-dashboard NodePort 10.99.137.187 <none> 2333:30029/TCP 3m42s
chaos-mesh-controller-manager ClusterIP 10.99.118.132 <none> 10081/TCP,10080/TCP,443/TCP 3m42s
Now that you know the service is running, go ahead and expose it, rename it, and open the dashboard using minikube service
:
$ kubectl expose service chaos-dashboard --namespace chaos-testing --type=NodePort --target-port=2333 --name=chaos
service/chaos exposed
$ minikube service chaos --namespace chaos-testing
|---------------|-------|-------------|---------------------------|
| NAMESPACE | NAME | TARGET PORT | URL |
|---------------|-------|-------------|---------------------------|
| chaos-testing | chaos | 2333 | http://192.168.49.2:32151 |
|---------------|-------|-------------|---------------------------|
? Opening service chaos-testing/chaos in default browser...
When the browser opens, you'll see a token generator window. Check the box next to Cluster scoped, and follow the directions on the screen.
Then you can log into Chaos Mesh and see the Dashboard.
You have installed your Chaos Mesh instance and can start working towards chaos testing!
Get meshy in your cluster
Now that everything is up and running, you can set up some new experiments to try. The documentation offers some predefined experiments, and I'll choose StressChaos from the options. In this walkthrough, you will create something in a new namespace to stress against and scale it up so that it can stress against more than one thing.
Create the namespace:
$ kubectl create ns app-demo
namespace/app-demo created
Then create the deployment in your new namespace:
$ kubectl create deployment nginx --image=nginx --namespace app-demo
deployment.apps/nginx created
Scale the deployment up to eight pods:
$ kubectl scale deployment/nginx --replicas=8 --namespace app-demo
deployment.apps/nginx scaled
Finally, confirm everything is up and working correctly by checking your pods in the namespace:
$ kubectl get pods -n app-demo
NAME READY STATUS RESTARTS AGE
nginx-6799fc88d8-7kphn 1/1 Running 0 69s
nginx-6799fc88d8-82p8t 1/1 Running 0 69s
nginx-6799fc88d8-dfrlz 1/1 Running 0 69s
nginx-6799fc88d8-kbf75 1/1 Running 0 69s
nginx-6799fc88d8-m25hs 1/1 Running 0 2m44s
nginx-6799fc88d8-mg4tb 1/1 Running 0 69s
nginx-6799fc88d8-q9m2m 1/1 Running 0 69s
nginx-6799fc88d8-v7q4d 1/1 Running 0 69s
Now that you have something to test against, you can begin working on the definition for your experiment. Start by creating chaos-test.yaml
:
$ touch chaos-test.yaml
Next, create the definition for the chaos test. Just copy and paste this experiment definition into your chaos-test.yaml
file:
apiVersion: chaos-mesh.org/v1alpha1
kind: StressChaos
metadata:
name: burn-cpu
namespace: chaos-testing
spec:
mode: one
selector:
namespaces:
- app-demo
labelSelectors:
app: "nginx"
stressors:
cpu:
workers: 1
duration: '30s'
scheduler:
cron: '@every 2m'
This test will burn 1 CPU for 30 seconds every 2 minutes on pods in the app-demo
namespace. Finally, apply the YAML file to start the experiment and view what happens in your dashboard.
Apply the experiment file:
$ kubectl apply -f chaos-test.yaml
stresschaos.chaos-mesh.org/burn-cpu created
Then go to your dashboard and click Experiments to see the stress test running. You can pause the experiment by pressing the Pause button on the right-hand side of the experiment.
Click Dashboard to see the state with a count of total experiments, the state graph, and a timeline of running events or previously run tests.
Choose Events to see the timeline and the experiments below it with details.
Congratulations on completing your first test! Now that you have this working, I'll share more details about what else you can do with your experiments.
But wait, there's more
Other things you can do with this experiment using the command line include:
- Updating the experiment to change how it works
- Pausing the experiment if you need to return the cluster to a steady state
- Resuming the experiment to continue testing
- Deleting the experiment if you no longer need it for testing
Updating the experiment
As an example, update the experiment in your cluster to increase the duration between tests. Go back to your cluster-test.yaml
and edit the scheduler to change 2 minutes to 20 minutes:
Before:
scheduler:
cron: '@every 2m'
After:
scheduler:
cron: '@every 20m'
Save and reapply your file; the output should show the new stress test configuration:
$ kubectl apply -f chaos-test.yaml
stresschaos.chaos-mesh.org/burn-cpu configured
If you look in the Dashboard, the experiment should show the new cron configuration.
Pausing and resuming the experiment
Manually pausing the experiment on the command line will require adding an annotation to the experiment. Resuming the experiment will require removing the annotation.
To add the annotation, you will need the kind, name, and namespace of the experiment from your YAML file.
Pause an experiment:
$ kubectl annotate stresschaos burn-cpu experiment.chaos-mesh.org/pause=true -n chaos-testing
stresschaos.chaos-mesh.org/burn-cpu annotated
The web UI shows it is paused.
Resume an experiment
You need the same information to resume your experiment. However, rather than the word true
, you use a dash to remove the pause.
$ kubectl annotate stresschaos burn-cpu experiment.chaos-mesh.org/pause- -n chaos-testing
stresschaos.chaos-mesh.org/burn-cpu annotated
Now you can see the experiment has resumed in the web UI.
Remove an experiment
Removing an experiment altogether requires a simple delete
command with the file name:
$ kubectl delete -f chaos-test.yaml
stresschaos.chaos-mesh.org "burn-cpu" deleted
Once again, you should see the desired result in the web UI.
Many of these tasks were done with the command line, but you can also create your own experiments using the UI or import experiments you created as YAML files. This helps many people become more comfortable with creating new experiments. There is also a Download button for each experiment, so you can see the YAML file you created by clicking a few buttons.
Final thoughts
Now that you have this new tool, you can get meshy with your environment. Chaos Mesh allows more user-friendly interaction, which means more people can join the chaos team. I hope you've learned enough here to expand on your chaos engineering. Happy pod hunting!
Comments are closed.