Failure is a feature in blameless DevOps

In blameless DevOps culture, failure is more than an option; it's our friend.

DevOps is just another term for value stream development. What does value stream mean?

Value is what arises during our interactions with customers and stakeholders. Once we get into value stream development, we quickly realize that value is not an entity. Value constantly changes. Value is a process. Value is a flow.

Hence the term stream. Value is only value if it's a stream. And this streaming of value is what we call continuous integration (CI).

How do we generate value?

No matter how carefully we specify value, its expectations tend to change. Therefore, the only realistic way to define and generate value is to solicit feedback.

But no one volunteers feedback. People are busy. We need to solicit feedback from our customers and stakeholders, but somehow they always have something more pressing to do. Even if we threw a tantrum and insisted that they stop what they're doing and give us much-needed feedback, at best we'd get a few lukewarm comments. Very little to go on. People are busy.

We slowly learn that the most efficient and effective way to solicit feedback is to fail. Failure is a sure-fire way to make our customers and stakeholders drop everything, sit up, and pay attention. If we refuse to fail, we continue marching down the development path confidently, only to discover later that we were heading in the wrong direction.

Agile DevOps culture is about dropping this arrogant stance and adopting the attitude of humility. We admit that we don't know it all, and we commit to a more humble approach to working on the value stream.

It is of paramount importance to fail as soon as possible. That way, failure is not critical; it is innocuous, easy to overcome, easy to fix. But we need feedback to know how to fix it. The best feedback is reaction to failure.

Let's illustrate this dynamic visually:

Figure: Value generation via a feedback loop from continuous solicitation

This figure illustrates the dynamics of producing value by soliciting feedback in a continuous, never-ending fashion.

Where does failure fit?

Where in the above process do we see failure? Time for another diagram:

Figure: Failure is the central driving force enabling the delivery of the value stream.

Failure is center stage. Without failure, nothing useful ever gets done. From this, we conclude that failure is our friend.

How do we know we failed?

In the bad old days of waterfall methodology, the prime directive was "Failure is not an option." We worked under the pressure that every step must be a fully qualified success. We went out of our way to avoid getting any feedback. Feedback was reserved for the momentous Big Bang event: the point when we all got an earful on how much the system we built missed the mark.

That was, in a nutshell, the traditional way of learning that we failed. With the advent of agile and DevOps, we underwent a cultural transformation and embraced incremental, iterative development processes. Each iteration starts with a mini failure, fixes it, and keeps going (mini being the keyword here). But how do we know if we failed?

The only way to know for sure is to have a measurable test or goal. The measurable test will let us know if—and how—we failed.
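To make the idea of a measurable goal concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `GoalResult` and `check_goal` names, the latency goal, and the numbers are hypothetical and not taken from any particular tool. It shows only that a measurable goal can report both whether we failed and by how much.

```python
from dataclasses import dataclass


@dataclass
class GoalResult:
    """Outcome of checking one measurable goal (hypothetical structure)."""
    name: str
    passed: bool
    shortfall: float  # how far the measurement overshot the target; 0.0 if met


def check_goal(name: str, measured: float, target: float) -> GoalResult:
    """Compare a measurement against an upper-bound target.

    Assumes "lower is better" (for example, latency in milliseconds).
    """
    shortfall = max(0.0, measured - target)
    return GoalResult(name=name, passed=(shortfall == 0.0), shortfall=shortfall)


# Illustrative numbers only: a goal of 250 ms and a measurement of 340 ms.
result = check_goal("checkout latency (ms)", measured=340.0, target=250.0)
print(result)
```

With these made-up numbers, the check reports a failure and a shortfall of 90.0, which is exactly the kind of early, small, fixable failure we want to surface.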

Now that we have set the stage and exposed the fundamentals of the blameless, failure-centric culture, the next article in this series will dive into a more detailed exposition on how to iterate over failed attempts to satisfy measurable tests and goals.

Alex has been doing software development since 1990. His current passion is how to bring soft back into software. He firmly believes that our industry has reached the level of sophistication where this lofty goal (i.e. bringing soft back into software) is fully achievable.


One thing I wonder about is how to navigate differing cultures between a team and external stakeholders. If your team accepts failure as a key way to gather feedback, how do you work across other stakeholders and clients who might read failure as incompetency?

That's a very good question. Agile DevOps depends on broken silos. If the silos are still going strong, it is impossible to commit to value stream delivery. With silos dominating the business landscape, any attempt at streaming value is doomed to turn into a waterfall.

We break silos not only by removing barriers between development and operations, but also by removing barriers between the engineering teams and the business stakeholders. That's why we often speak about DevOps as having 'skin in the game'. Business stakeholders are expected to be part of the value stream delivery. This means business stakeholders are equally part of the stream of early and frequent failures. If the business stakeholders were not included in this failing early with agility, they'd never be in a position to give us feedback. As I've mentioned in this article, no one volunteers their feedback; feedback must be solicited. We solicit feedback by failing fast, failing early, and in that way provoking the business to respond with their feedback. And their feedback is what constitutes, in that moment, value.


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.