I can't bake croissants: a fable on project documentation

flour + butter + stuff

Opensource.com

Hi! I'm Mel. When I'm not doing Free Software and Open Source stuff, I'm a learning psychology geek. One of the questions I get asked a lot by fellow FOSS hackers is: Mel! Why don't people help me with my project?

Based on a CC-BY-3.0 photo by assbach

Before I can respond, they quickly say:

  1. But I have documentation!
  2. And I don't bite on IRC!
  3. Really!

So I look at the documentation, watch them interact with folks on IRC, and within moments, I can answer: "All right, I see what your problem is."

Now, instead of explaining this for software, I'm a little hungry, so I'm going to talk about croissants.

Croissant CC-BY-2.0 by Alanna Risse

I'm learning how to bake. I'm a terrible baker. I touch an oven and something blows up. So I'm very much a novice in the baking world. Let's say I want to learn how to bake bread, so I look online for some bread recipes. I might come across a recipe that looks like this:

Croissants

  • flour
  • butter
  • stuff

Mix then bake it!

Ask me if you have questions!

And I promptly go "bwuuuuh?" and ignore everything, because I don't know where to start. I'm not even sure that what you're talking about and what I want are the same thing. I mean, what the heck is a croissant?

It might help if you let me know that it's a type of bread.

croissant-madrid CC-BY-3.0 by Tamorlan

Croissants: tasty flaky buttery bread

  • flour
  • butter
  • stuff

Mix then bake it!

Ask me if you have questions!

Right! Bread! The thing I'm trying to learn to bake! I might actually want to make this stuff now. I'm still not sure exactly how I do that, though.

What's that list of stuff? How do I bake it? Nobody told me, so I went out and got flour, butter, and... some stuff - lemon drops and marshmallows are stuff, right? So I put it in a bowl to mix it and now I...

Oh, shoot. You're telling me I needed to get an oven beforehand? And preheat it? What does that even mean? Have I done something terribly wrong?

Should I give up and go away?

based on cornetti1 CC-BY-SA-3.0 by exeair

In contrast, look at this WikiHow guide to croissant-making. There's a clear ingredients list: 4 cups flour, 1 Tablespoon active dry yeast (2 packets), and so on. There are steps. They're illustrated. They tell me what the dough should look like ("smooth elastic consistency") and how long each step will take ("until it doubles in size... should take 1.5 to 2 hours.") Heck, step 20 even tells me to turn the oven on so it's preheated and ready to go in step 22, and there's a list of things that go well with croissants (butter, jam, ham, cheese) at the end. And there's a picture of the end result that makes my mouth water.

We know these clear step-by-step instructions are important in a cookbook because we're mostly novices in baking-land. We can't just improvise through because we don't know how these ingredients are going to interact with each other; we aren't familiar with pastry-making, so even common tools and techniques will be unfamiliar. We've never made (and maybe never even eaten) croissants before so we can't visualize the process or the end result without help.

The Mel approves

We don't have context.

And yet we think that instructions like this should be understandable by all human beings:

Download these tarballs, compile them, and everything should work.

They're understandable to us as experienced software hackers – most of the time when we write documentation like that, we're merely providing an example of something we've done ourselves dozens of times in testing. We're experts. We've done it all before, we know what to expect, and rough notes are sufficient for us to reproduce past results.

But our audience is not. Context is important. Experience is important. And you can never assume that your audience—those who follow behind you—will have had either, and especially not in the exact same ways that you have had.

So if you're wondering why people don't follow your instructions to help you with your project, hit your local library and check out a cookbook. Bake something you've never baked before. (If you're a good baker, try to prepare a vegan dessert or a gluten-free bread - do something unfamiliar.) Notice what it feels like to be a novice. Notice which instructions make you feel nervous and which make you feel confident. Then, while eating the fruits of your labors, open your documentation again and take a look at it with your "beginner's mind."

If you'd like to learn more about the differences between novice and expert thought in any given domain, check out the Dreyfus Model of Skill Acquisition, which is secretly what this entire bit on croissants has been about.

That's all the time we have for today. Thank you.

This article was originally a 5-minute lightning talk delivered at FUDCon Tempe.

Mel Chua is a contagiously enthusiastic hacker, writer, and educator with over a decade of teaching and curriculum development experience and a solid track record in leadership positions at Red Hat, One Laptop Per Child, Sugar Labs, Fedora, and other Free, Libre, and Open Source Software (FLOSS) communities.

10 Comments

As a program manager, I find I spend lots of time trying to explain to the engineers working on my programs why documentation that they "just get" isn't good enough for the documentation we release with our products. This is a great example and one that I will be sharing with them. Hopefully this time it's the engineers that "just get" it. :)

I have been implementing a feature in our intranet that a co-worker developed and wrote up the documentation for (he admits it was thrown together and not meant for consumption).

It has not been easy to understand it all in part because there were so many assumptions he knew from building it initially, and how it was supposed to work. Neither did he put down the specific names of some of the variables and how they should be used.

I've hodge-podged my way through it and now that I have a grasp of where *I* fell short, I am trying to fill in the gaps of the documentation so the next time (and for the next person) the necessary tools will be spelled out.

Whenever I am writing documentation for anything (articles, tutorials, etc.) I keep one person in mind: my wife. I know her very well, and I know she is not into computers enough, so I have to describe things very carefully (and visually). Sometimes as I look over what I am working on, I can hear her voice in my head and I listen to it. I use her as an image of who I am writing for to keep from taking anything for granted.

It doesn't always work, but it helps!

The thing I like about your example is that you have a specific individual in mind - your wife - rather than a broad class of "developer," which is a vague catch-all that's difficult to test. How do you check whether documentation for your wife works? You show it to your wife. How do you check whether documentation for "a developer" works? Ummm....

Better to write for a specific developer - or to have that developer write it for you, as you've pointed out. Grab an intern, say you're going to teach them to do something cool, and they're going to write it up and have a piece of tech writing in their name they can get full credit for, and you'll help edit. Effectively, they pay for teaching in documentation, which is a great strategy to draw new people into a project; a "beginner's mind" can be a powerful tool if it comes with the articulateness to describe it.

Finally, in describing how you fill in documentation gaps you discover, you've perfectly illustrated the "leave things better than you found them" rules that keep projects - and living rooms - in sane shape.

But yeah, it's important to write for specific people (and get that writing tested - even briefly - by those specific people). It's why professional designers use personas instead of "market segments." I've written for my 13-year-old cousin, for my boyfriend, for my boss, the interns on my team, for developers I've admired but am too shy to ask questions of without offering something of value in return ("if you teach me how to do this, I'll write instructions from it so you'll never have to teach anyone again!"). The key, for me, is actually having those people check out the docs once I've written them. And if you haven't had your documentation ripped apart by a 13-year-old girl... it's a humbling experience.

What most programmers don't realize, and what is not taught in schools, is that there are two types of documentation: one for the users, and one for the code maintainers. Since they serve different purposes, they must be done separately. The one for the users should be written by your technical writers, with input from your programmers, but the one for the maintainers should be written solely by your programmers. And too often, the coding standard asks for a description, which results in the code being repeated in bad English. What should be documented is the purpose of the code: why it is done this way and not some other. That will tell the maintainers what might happen if they make significant changes to the code. Two types of documentation to serve two different purposes.
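
To make that distinction concrete, here is a minimal sketch in C. The function name, the retry limit, and the rationale given in the comment are all hypothetical, invented purely for illustration; the only point is the contrast between a comment that restates the code and one that records its purpose.

```c
#include <assert.h>

/* Hypothetical limit, made up for this example. */
#define MAX_RETRIES 3

/*
 * A "description" comment that merely repeats the code in English:
 *
 *     Returns 1 if attempts is less than MAX_RETRIES, otherwise 0.
 *
 * A "purpose" comment that tells a maintainer why the code looks this way:
 *
 *     Decide whether a failed request may be retried. The limit is kept
 *     small on purpose: in this imagined project, unbounded retries once
 *     flooded the server, so raising MAX_RETRIES needs a load test first.
 */
int should_retry(int attempts)
{
    assert(attempts >= 0); /* a negative count would indicate a caller bug */
    return attempts < MAX_RETRIES;
}
```

The first comment tells a maintainer nothing the code does not already say; the second warns them what might break if they change the limit.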

Good point - I should have made that distinction. The original talk was to an audience of developers (I'm an engineer myself), so I had developer documentation in mind; one of my pet peeves in any technical project is how few engineers get their documentation tested by other engineers, then wonder why nobody can interface with or extend their component.

Documenting the purpose of the code and the rationale behind decisions made is also an excellent point. I've seen a lot of young devs join a project and go "Why did they use $technology_foo? They must be idiots; why not $technology_bar - I can rewrite this with $technology_bar in 2 hours!" (or something of the sort), then talk with older devs and learn that the product was written when $technology_bar was unstable, or the customer it was made for insisted on $technology_foo, then demonstrate that $technology_bar crashes after a dozen users and doesn't port to different browsers and is nonextensible, and so forth.

Version control and keeping meeting logs where these sorts of debates and decisions are documented are good ways to help keep track of this, because no, your logic and decision process isn't magically embedded in your code when you check it in.

So what motivates someone to provide clear context in their documentation? I think you have to have a certain sense of love or appreciation for your customers -- and want them to appreciate that you helped them. If that "love" is not natural, then you probably need a feedback loop.

Remember the funny children's game where one kid (blindfolded) gives step-by-step instructions for something simple (like tying a knot, making a PB&J sandwich, or setting a table) and the other kids have to follow the steps EXACTLY, without filling in the gaps? Chaos ensues, and then the blindfolded "instructor" gets to see the ridiculous results. I wish tarballs worked that way.

My main motivation, personally, is the hope that someday I won't maintain *any* of the code I've written. I get bored easily - once I've finished a project sufficiently to scratch my own itch, I immediately check whether it scratches anyone else's itch... and if it does, I try to get them to make it scratch their itch *better*... and if that works, gradually other people end up doing more (and better - I'm actually not a good coder at all) development on it than I am.

At the point where they outpace me and I'm actually a bottleneck and hindrance to their efforts, I ask if they'd like to take over maintenance so I can get out of the way of their productivity. So far, everyone's said yes. And I get to keep using the software I originated. Win-win. But it depends - crucially - on having good developer docs. So I typically write these from the chat logs of the first new developer I (manually) walk through hacking on my projects, because (by definition) they contain exactly the questions needed to get a new person started.

Then I'm free to move on to my next awesome project.

This makes great points regarding how easy it is to overlook simple steps.

I got a good feel for this when participating in QA, where missing any step results in a "Resolved: Cannot reproduce" update, and scorn from the developer(s). After careful review, you realize that between step 3 and 4, there should have been a note about using the Enter key instead of clicking the Login button (or some other variable action).

The key in QA is to make sure that a computer could replicate the issue following your steps (computers being much more literal than your spouse, child, cousin, non-technical relative, etc.). I always document bugs assuming that a computer will be trying to replicate the issue. That means consistency in object/detail reference, verbs, quotation, case, etc. Just like in a recipe, where someone may get confused or start second-guessing themselves if you switch from "tsp." to "teaspoon" repeatedly.

There is a little bit of room in end user documentation, as placeholders will often offer comfort to users. That is, merely identifying that a section is missing, or more details are needed in your docs, can make a huge difference. It is the difference between someone getting lost and giving up, and that same person realizing that they are lost in an area that is declared as confusing. This simple acknowledgment could lead to a user investigating further (or even contacting you via email/irc/whatever), instead of just finding a different project.

As Mel and several of the comments alluded to, the problem is assuming that everyone else carries the same experience/perspective/assumptions that you do. I am guilty of it too, and it is time to stop complaining about writing the documentation. Even though it's a pain, just start updating it so that anyone can understand what you are trying to say. It will save everyone a lot of time in the long run.

And for those of us who want to participate in opensource - find the documentation that's confusing, and contact the project participants about fixing it! Don't give up, or just complain, it's the perfect avenue to start participation, and you get to learn a lot about the project along the way.

The name of the concept is inferential distance (http://wiki.lesswrong.com/wiki/Inferential_distance) - the assumed knowledge your statements rest on, the things someone needs to know before they understand what your sentence means.

Yes, every geek needs to understand this concept :-) Geeks have a tendency to expect short inferential distances (http://lesswrong.com/lw/kg/expecting_short_inferential_distances/) and suffer the illusion of transparency (http://wiki.lesswrong.com/wiki/Illusion_of_transparency) - the misleading impression that your words convey more to others than they actually do.

Communication remains a *hard* problem, even as it's one our brains specifically evolved to handle.

Let me add my 2.0e-2 cents.
Speaking on a very, very general basis (your mileage may vary), what I miss most in some open source projects is a very high-level view of the structure of the software. Often you find in the /doc directory a group of automatically generated *.html files with the inheritance diagrams of the classes and the prototypes of the functions. This documentation is *too detailed*. If you are an experienced developer, it can be useful to have something that reminds you that Audio_Bugaboo inherits from Multimedia_Bugaboo, which in turn inherits from Abstract_Bugaboo, but for a newbie who does not know _what_ a Bugaboo is and what it should do, a rough block diagram that shows where a Bugaboo enters the picture can be more useful.

Another place where I would like to see some documentation (and where I usually put it) is in the *.h files (well, actually in *.ads files; I use Ada...). From my point of view, a developer is a _user_ of the small module that your .c file implements, and the best place to put the instructions for using the module is the .h file. However, often what you find inside, say, multibugaboo.h is a one-line comment similar to

<code>
/* multibugaboo.h: Multibugaboo implementation */
</code>

Duh... Really?!? That is a surprise...
What in the $place is a Multibugaboo?!? What is its duty?!!? What kind of functions/procedures/data structures does this module provide?!? By the way, a detailed description of the internals of a multibugaboo is not necessary (if you really need one, put it in the .c file); a very high-level description would suffice, so that if I am searching for the code that formats the packets and sends them over the network, I know whether I am in the right file or not.
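
As a rough illustration of what such a high-level header comment might look like, here is a small sketch in C. Everything about this Multibugaboo is an assumption made up for the example (what it does, the function names `multibugaboo_init` and `multibugaboo_add`, and the capacity); the declarations and their bodies are kept in one file only so that the sketch compiles on its own, whereas in a real module the bodies would live in the .c file.

```c
/*
 * multibugaboo.h (hypothetical sketch)
 *
 * WHAT IT IS
 *   In this made-up example, a Multibugaboo collects several small
 *   messages into one packet before the packet goes out on the network.
 *
 * WHAT YOU WILL FIND HERE
 *   - multibugaboo_init(): start with an empty packet
 *   - multibugaboo_add():  append one message to the packet
 *
 * WHAT YOU WILL NOT FIND HERE
 *   The code that actually sends packets; in this sketch that would
 *   live in a separate module.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MULTIBUGABOO_CAPACITY 64 /* arbitrary size chosen for the example */

typedef struct {
    char   data[MULTIBUGABOO_CAPACITY]; /* bytes queued so far */
    size_t used;                        /* number of valid bytes in data */
} Multibugaboo;

/* Reset a packet so that it holds no messages. */
void multibugaboo_init(Multibugaboo *m)
{
    assert(m != NULL);
    m->used = 0;
}

/* Append len bytes of msg; return 1 on success, 0 if it would not fit. */
int multibugaboo_add(Multibugaboo *m, const char *msg, size_t len)
{
    assert(m != NULL && msg != NULL);
    if (len > MULTIBUGABOO_CAPACITY - m->used)
        return 0;
    memcpy(m->data + m->used, msg, len);
    m->used += len;
    return 1;
}
```

A newcomer who reads only the top comment can already tell whether this is the file they are looking for, without reading a single function body.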

[[ Sorry for the rant... Imagine lots of smiles next to it :-) :-) :-) I am smiling while I write, but the electronic medium does not carry this kind of side information. ]]

So, bottom line, high-level documentation is what a newbie often needs. Paradoxically, documenting the finest details is perhaps not so important; they can be deduced from nicely written code. Moreover, detailed comments can cause problems if the implementation details change but the comments do not.

Let me conclude by telling you my personal approach to this problem. In order to practice what I preach, I have found it useful to keep a wiki (edited only by myself) as a notepad for ideas. The wiki gives you an easy way to keep your notes well-ordered, and it is fast enough not to interrupt the flow of your ideas. In the end, the wiki, suitably refined, should be a good basis for the high-level overview. Also, scanned hand-drawn pictures make good placeholders for diagrams & stuff to be produced later with some drawing software. Of course, your mileage may vary...

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.