Teaching software libraries by example


These days there is a software library for nearly every occasion. Many of them are well designed and well implemented. Unfortunately, almost none of them present their documentation in a way that lets a new user quickly understand the basics and put the library to work effectively.

Many libraries use Doxygen or a similar tool to convert comments in their code into HTML documentation. The goal here is to explain what each function and class is and does. In some cases, this is all the user gets. This is equivalent to saying “This is a hammer. It is used to hit nails. This is a nail. It is used to hold wood together.” and then expecting the user to be able to build a house. This is simply not the right kind of information for the user to learn to use the tool for their needs.

If you’re lucky, the library you’re trying to use will have tutorials. A tutorial typically explains how the basics of a class or part of the library work. Tutorials are great, and should certainly be part of the whole suite of ways the user can learn a library, but they have two major drawbacks. First, they are extremely time-consuming to create. They typically include large paragraphs of descriptive text for every few lines of code, as well as images, diagrams, etc., all of which take significant effort to produce. Second, each tutorial covers only a single use case. That is, it demonstrates a very basic usage and leaves the dozens of possible variants to the imagination.

What, then, am I suggesting, if documentation and tutorials aren’t enough? Examples! What is the difference between an example and a tutorial? An example is a quick and dirty demonstration of a particular use case of the library. It does not require lengthy text describing each line of code, but rather only a couple of sentences explaining what the example does. The how is left for the reader to work out from the code, with related tutorials and documentation as a reference. You might think that covering every use case is impossible. Of course you are correct. However, covering enough use cases that the user can figure out what they want to do is actually quite attainable.

The procedure for generating these is quite simple, and most of the work is already being done! The key is collecting it and storing it in an easy-to-find, easy-to-use format. The ideal process goes as follows. A user runs into a problem they don’t know how to solve. This is the use case that will evolve into an example. The user poses the question to the community through a forum, mailing list, IRC channel, or any other means of communication. Rather than simply asking a question (“How do you do XYZ?”), they should instead create a minimal example of what they have tried, along with an explanation of what their code currently does and how this differs from what they want it to do. At this point, an example is almost born! Even before the problem is solved, the example should be added to the collection of examples in a “wish list” section. Contributors should never answer a question that is not in the wish list. This does two things. First, it ensures the question is posed in a way that will help future users as well as the current question asker. Second, it (almost) ensures the answer will be integrated into the question, leaving a nice, neat, compilable example for everyone in the community to enjoy.
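As a rough sketch of what such a wish-list submission might look like, here is a hypothetical question about removing duplicates from a std::vector. The scenario and the function name removeDuplicatesAttempt are made up for illustration, and only the standard library is used. The comment at the top states what the code does now and what the asker wants instead.

```cpp
// Wish-list example (hypothetical): remove duplicates from a vector.
//
// What the code does now: std::unique only collapses *adjacent* duplicates,
//   so {3, 1, 3, 2, 1} comes back unchanged.
// What I want: each value kept once, in first-seen order, i.e. {3, 1, 2}.
#include <algorithm>
#include <vector>

std::vector<int> removeDuplicatesAttempt(std::vector<int> values)
{
  // std::unique moves the surviving elements to the front and returns
  // an iterator to the new logical end of the range.
  auto newEnd = std::unique(values.begin(), values.end());
  values.erase(newEnd, values.end());
  return values;
}
```

Whoever answers only has to change the body of the function, and the wish-list entry turns into a finished example.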

In my experience, doing this helps in several ways. The person asking the question sometimes solves their own problem, because posing the question in this way forces the problem to be reduced to a question about a small, specific part of the library. The person answering the question has most of the work done for them, and can spend less time explaining the solution to the problem. Mailing list traffic is significantly reduced: while people are still learning to search the examples on their own, most questions can be answered by simply replying with a link to an example. Eventually, when people learn to check the examples for themselves, most questions disappear entirely, and any new question indicates that an example needs to be added. Finally, the result is an amazing collection of examples that users can use to learn how to do what they need to do. Of course, if they cannot, then it is time for another example!

The beauty of this process is that the work is crowdsourced. And even better, the crowd has a vested interest in participating, because the benefit goes directly to them!

I have applied this technique in several software libraries. The most successful have been the VTK Examples and ITK Examples. In PCL (pointclouds.org), examples are now added directly to the repository. I have created a website dedicated to this effort for other libraries (while the maintainers are being convinced of the idea!) and even programming languages themselves. Several Qt examples, C++ examples and Boost examples are available. The Boost Graph Library (BGL) has been receptive and we are now in the process of moving the BGL examples into the Boost repository.

What are you waiting for? Go out there and start exampling!

I am currently working on a Ph.D. in Electrical Engineering at Rensselaer Polytechnic Institute. I work in the field of computer vision and image processing. My research deals with 3D data analysis, particularly from LiDAR scanners. I have benefited tremendously from the practices of open source and strive to continue to do my part to continue the give-and-take cycle!

9 Comments

David,
good idea ... but why not turn these Examples into Test Cases?
You certainly know how the testing aspect is critical, and especially how it's not so well applied for open source products. I see, in what you are proposing, a possible way to achieve a testing approach that's effectively oriented to the user needs.
Thank you, I'll try to put in place your approach.

cheers,
Davide

Davide,

Tests must be much more complicated than examples. Examples should cover the "happy path" (http://en.wikipedia.org/wiki/Happy_path) to demonstrate how the code is supposed to work. Tests, on the other hand, must cover all paths. This means you'll set up cases to intentionally cause exceptions, set every possible parameter, etc.

It is partly because of this that I've almost never seen a Test that can be used as an example. Unfortunately, most developers don't see this enormous distinction. I will see questions on a mailing list that are answered by "look at the test". When you look at the test, you see a 2000-line mess of code that potentially has buried inside of it what you are looking for. The concept of an example is exactly extracting that piece of demonstration into a standalone, compilable piece of code.
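To make the distinction concrete, here is a small sketch. The function parsePercent is a hypothetical stand-in for a library call, not part of any real library. The example shows only the intended use, while the test deliberately feeds in bad input to check each failure mode.

```cpp
#include <cstddef>
#include <initializer_list>
#include <stdexcept>
#include <string>

// Hypothetical library function, used only for illustration.
int parsePercent(const std::string& text)
{
  std::size_t consumed = 0;
  // std::stoi throws std::invalid_argument if no number can be parsed.
  int value = std::stoi(text, &consumed);
  if (consumed != text.size())
    throw std::invalid_argument("trailing characters after the number");
  if (value < 0 || value > 100)
    throw std::out_of_range("percent must be in [0, 100]");
  return value;
}

// Example: the happy path, showing how the call is meant to be used.
int exampleUsage()
{
  return parsePercent("42");
}

// Test: every input here is wrong on purpose.
bool testRejectsBadInput()
{
  for (const char* bad : {"-1", "101", "42abc", "abc", ""}) {
    try {
      parsePercent(bad);
      return false; // accepted something it should have rejected
    } catch (const std::logic_error&) {
      // Expected: std::invalid_argument and std::out_of_range
      // both derive from std::logic_error.
    }
  }
  return true;
}
```

A reader who wants to know how to call parsePercent gets everything they need from exampleUsage; the test is mostly about inputs nobody should pass.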

Good luck!

I agree ... when I posted my comment I had in mind all those projects that are not used to test so much. They can benefit a lot from the contribution of end-user-driven examples.

ciao,
Davide

If you link examples into the Doxygen configuration for your project, your documentation will even end up with nice links to and from the reference docs for the applicable class(es).

We did that for Qt Cryptographic Architecture (QCA). It has the per-method reference stuff, the example applications, and some high level explanation.

Tests aren't examples of how to apply the code. They need to cover all kinds of strange inputs (making sure that something non-destructive happens if the user of the API gets it wrong) and that obscures the "right way" to use the API.

Brad,

That is a really great practice! After a quick Google search I didn't find the examples for QCA - can you post a link?

We have done this same cross-linking in the ITK project (see http://www.itk.org/Doxygen/html/classitk_1_1AddImageFilter.html for an example). There is a "Wiki Examples" section. We also link the other way - from the examples to the Doxygen: http://www.itk.org/Wiki/ITK/Examples#Trigonometric_Filters (the ITK Classes Demonstrated column).

We have a full system specifically for Example display and editing in the works. It is much more tailored to the job than a standard wiki. Stay tuned... !

The overview page is at:
http://delta.affinix.com/docs/qca/

The examples are linked off a tab at the top of that page, but one direct example is at:
http://delta.affinix.com/docs/qca/aes-cmac_8cpp-example.html

Note that this linking doesn't require any real effort in doxygen. All you do is tell doxygen where the examples are, and write a little introduction for the options, which looks like:
http://websvn.kde.org/trunk/kdesupport/qca/examples/examples.doco?view=markup
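For readers who have not seen this before, a minimal sketch of the idea follows. The class Adder and the file name adder_example.cpp are hypothetical and are not taken from QCA. The parts that are real Doxygen features are the \example command and the EXAMPLE_PATH setting in the Doxyfile, which tells Doxygen where to look for the example source.

```cpp
// adder.h (hypothetical library header, for illustration only)
//
// Assumed Doxyfile setting, so Doxygen can find the example source:
//   EXAMPLE_PATH = examples

/**
 * \brief Adds integers. Stands in for a real library class.
 */
class Adder
{
public:
  /** \brief Returns the sum of \a a and \a b. */
  int add(int a, int b) const { return a + b; }
};

/**
 * \example adder_example.cpp
 * Adds two numbers with Adder::add.
 */
```

With this in place, Doxygen lists adder_example.cpp on its examples page and cross-references the documented classes and members that the example uses.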

An important aspect of this approach is that you can keep the example code in the main build system and make sure it keeps building. That is, the compiler will help to ensure that every example really is OK for the current state of the API. It's annoying when the examples don't work...

Very nice! As you mentioned, compiling the examples routinely is certainly critical.

The only design goal I've had that this technique does not meet is the ability for an average user to add, update, or fix an example. Since these examples reside in the main repository, I'm assuming that only "developers" (those who have earned commit access) can touch them. I suppose there could be an argument for "only developers, who have demonstrated that they know best practices, etc., should be writing examples anyway." However, in my experience the people who write the most examples are "tweeners" - those who are very skilled at using the library, but not necessarily interested in development, and therefore do not have commit access. Truly "crowdsourcing" the examples puts the burden more on the whole community and lets the developers keep watch without having to dedicate a lot of effort to writing examples.

I'm really glad to see other projects taking this seriously, thanks for the links Brad!

I don't know if it's unique to the Python community, but the prevalence of the Sphinx documentation tool (which has more in common with LaTeX than Doxygen) seems to have pushed things in the other direction.

There will be a few tutorials, maybe even examples if you're really lucky... but few (if any) usable API docs. I often find myself wishing, more than anything, for an alternative to reading the source for insight into some file, function, class, or method that didn't even get one of the explicit "let the autodoc handle this" directives.

I wish this advice had been given to man page writers some 20 or so years ago...

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.