Why hypothesis-driven development is key to DevOps

A hypothesis-driven development mindset harvests the core value of feature flags: experimentation in production.

gears and lightbulb to represent innovation

Image by:

Opensource.com

The definition of DevOps, offered by Donovan Brown is "The union of people, process, and products to enable continuous delivery of value to our customers." It accentuates the importance of continuous delivery of value. Let's discuss how experimentation is at the heart of modern development practices.

Reflecting on the past

Before we get into hypothesis-driven development, let's quickly review how we deliver value using waterfall, agile, deployment rings, and feature flags.

In the days of waterfall, we had predictable and process-driven delivery. However, we only delivered value towards the end of the development lifecycle, often failing late as the solution drifted from the original requirements, or our killer features were outdated by the time we finally shipped.

Here, we have one release X and eight features, which are all deployed and exposed to the patiently waiting user. We are continuously delivering value—but with a typical release cadence of six months to two years, the value of the features declines as the world continues to move on. It worked well enough when there was time to plan and a lower expectation to react to more immediate needs.

The introduction of agile allowed us to create and respond to change so we could continuously deliver working software, sense, learn, and respond.

Now, we have three releases: X.1, X.2, and X.3. After the X.1 release, we improved feature 3 based on feedback and re-deployed it in release X.3. This is a simple example of delivering features more often, focused on working software, and responding to user feedback. We are on the path of continuous delivery, focused on our key stakeholders: our users.

Using deployment rings and/or feature flags, we can decouple release deployment and feature exposure, down to the individual user, to control the exposure—the blast radius—of features. We can conduct experiments; progressively expose, test, enable, and hide features; fine-tune releases, and continuously pivot on learnings and feedback.

When we add feature flags to the previous workflow, we can toggle features to be ON (enabled and exposed) or OFF (hidden).

Here, feature flags for features 2, 4, and 8 are OFF, which results in the user being exposed to fewer of the features. All features have been deployed but are not exposed (yet). We can fine-tune the features (value) of each release after deploying to production.

Ring-based deployment limits the impact (blast) on users while we gradually deploy and evaluate one or more features through observation. Rings allow us to deploy features progressively and have multiple releases (v1, v1.1, and v1.2) running in parallel.

Exposing features in the canary and early-adopter rings enables us to evaluate features without the risk of an all-or-nothing big-bang deployment.

Feature flags decouple release deployment and feature exposure. You "flip the flag" to expose a new feature, perform an emergency rollback by resetting the flag, use rules to hide features, and allow users to toggle preview features.

When you combine deployment rings and feature flags, you can progressively deploy a release through rings and use feature flags to fine-tune the deployed release.

See deploying new releases: Feature flags or rings, what's the cost of feature flags, and breaking down walls between people, process, and products for discussions on feature flags, deployment rings, and related topics.

Adding hypothesis-driven development to the mix

Hypothesis-driven development is based on a series of experiments to validate or disprove a hypothesis in a complex problem domain where we have unknown-unknowns. We want to find viable ideas or fail fast. Instead of developing a monolithic solution and performing a big-bang release, we iterate through hypotheses, evaluating how features perform and, most importantly, how and if customers use them.

Template: We believe {customer/business segment} wants {product/feature/service} because {value proposition}.

Example: We believe that users want to be able to select different themes because it will result in improved user satisfaction. We expect 50% or more users to select a non-default theme and to see a 5% increase in user engagement.

Every experiment must be based on a hypothesis, have a measurable conclusion, and contribute to feature and overall product learning. For each experiment, consider these steps:

Observe your user
Define a hypothesis and an experiment to assess the hypothesis
Define clear success criteria (e.g., a 5% increase in user engagement)
Run the experiment
Evaluate the results and either accept or reject the hypothesis
Repeat

Let's have another look at our sample release with eight hypothetical features.

When we deploy each feature, we can observe user behavior and feedback, and prove or disprove the hypothesis that motivated the deployment. As you can see, the experiment fails for features 2 and 6, allowing us to fail-fast and remove them from the solution. We do not want to carry waste that is not delivering value or delighting our users! The experiment for feature 3 is inconclusive, so we adapt the feature, repeat the experiment, and perform A/B testing in Release X.2. Based on observations, we identify the variant feature 3.2 as the winner and re-deploy in release X.3. We only expose the features that passed the experiment and satisfy the users.

Hypothesis-driven development lights up progressive exposure

When we combine hypothesis-driven development with progressive exposure strategies, we can vertically slice our solution, incrementally delivering on our long-term vision. With each slice, we progressively expose experiments, enable features that delight our users and hide those that did not make the cut.

But there is more. When we embrace hypothesis-driven development, we can learn how technology works together, or not, and what our customers need and want. We also complement the test-driven development (TDD) principle. TDD encourages us to write the test first (hypothesis), then confirm our features are correct (experiment), and succeed or fail the test (evaluate). It is all about quality and delighting our users, as outlined in principles 1, 3, and 7 of the Agile Manifesto:

Our highest priority is to satisfy the customers through early and continuous delivery of value.
Deliver software often, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
Working software is the primary measure of progress.

More importantly, we introduce a new mindset that breaks down the walls between development, business, and operations to view, design, develop, deliver, and observe our solution in an iterative series of experiments, adopting features based on scientific analysis, user behavior, and feedback in production. We can evolve our solutions in thin slices through observation and learning in production, a luxury that other engineering disciplines, such as aerospace or civil engineering, can only dream of.

The good news is that hypothesis-driven development supports the empirical process theory and its three pillars: Transparency, Inspection, and Adaption.

But there is more. Based on lean principles, we must pivot or persevere after we measure and inspect the feedback. Using feature toggles in conjunction with hypothesis-driven development, we get the best of both worlds, as well as the ability to use A|B testing to make decisions on feedback, such as likes/dislikes and value/waste.

Remember:

Hypothesis-driven development:

Is about a series of experiments to confirm or disprove a hypothesis. Identify value!
Delivers a measurable conclusion and enables continued learning.
Enables continuous feedback from the key stakeholder—the user—to understand the unknown-unknowns!
Enables us to understand the evolving landscape into which we progressively expose value.

Progressive exposure:

Is not an excuse to hide non-production-ready code. Always ship quality!
Is about deploying a release of features through rings in production. Limit blast radius!
Is about enabling or disabling features in production. Fine-tune release values!
Relies on circuit breakers to protect the infrastructure from implications of progressive exposure. Observe, sense, act!

What have you learned about progressive exposure strategies and hypothesis-driven development? We look forward to your candid feedback.