Teaching software libraries by example

These days there is a software library for nearly every occasion. Many of them are well designed and well implemented. Unfortunately, almost none of them present their documentation in a way that lets a new user quickly understand the basics and put the library to work effectively.

Many libraries use Doxygen or a similar tool to convert comments in their code into HTML documentation. The goal here is to explain what each function and class is and does. In some cases, this is all the user gets. This is equivalent to saying “This is a hammer. It is used to hit nails. This is a nail. It is used to hold wood together.” and then expecting the user to be able to build a house. It is simply not the kind of information a user needs in order to learn to apply the tool to their own problem.

If you’re lucky, the library you’re trying to use will have tutorials. A tutorial typically explains how the basics of a class or part of the library work. Tutorials are great, and should certainly be part of the whole suite of ways a user can learn a library, but they have two major drawbacks. First, they are extremely time-consuming to create. They typically include large paragraphs of descriptive text for every few lines of code, as well as images, diagrams, etc., all of which take significant effort to produce. Second, a tutorial covers only a single use case. That is, it demonstrates a very basic usage and leaves the dozens of possible variants to the imagination.

What, then, am I suggesting, if documentation and tutorials aren’t enough? Examples! What is the difference between an example and a tutorial? An example is a quick-and-dirty demonstration of a particular use case of the library. It does not require lengthy text describing each line of code, but rather only a couple of sentences explaining what the example does. The how is left to the reader, who can study the code and refer to related tutorials and documentation. You might think that covering every use case is impossible. Of course you are correct. However, covering enough use cases that users can figure out how to do what they want is quite attainable.

The procedure for generating these examples is quite simple, and most of the work is actually already being done! The key is collecting it and storing it in an easy-to-find, easy-to-use format. The ideal process goes as follows. A user runs into a problem they don’t know how to solve. This is the use case that will evolve into an example. The user poses the question to the community through a forum, mailing list, IRC channel, or any other means of communication. Rather than simply asking a question (“How do you do XYZ?”), they should create a minimal example of what they have tried, along with an explanation of what their code currently does and how this differs from what they want it to do. At this point, an example is almost born! Even before the problem is solved, the example should be added to the collection of examples in a “wish list” section. Contributors should never answer a question that is not in the wish list. This rule does two things. First, it ensures the question is posed in a way that will help future users as well as the person currently asking. Second, it (almost) ensures the answer will be integrated into the question, leaving a nice, neat, compilable example for everyone in the community to enjoy.

In my experience, doing this helps in several ways. The person asking the question sometimes solves their own problem, because posing the question in this way forces the problem to be reduced to a question about a small, specific part of the library. The person answering the question has most of the work done for them, and can spend less time explaining the solution. Mailing list traffic is significantly reduced: at first, while people are still learning to search the examples on their own, most questions can be answered by simply replying with a link to an example. Eventually, when people learn to check the examples for themselves, most questions disappear entirely, and each new question is a sign that an example needs to be added. Finally, the result is an amazing collection of examples that users can use to learn how to do what they need to do. Of course, if they cannot, then it is time for another example!

The beauty of this process is that the work is crowdsourced. And even better, the crowd has a vested interest in participating, because the benefit goes directly to them!

I have applied this technique in several software libraries. The most successful have been the VTK Examples and ITK Examples. In PCL (pointclouds.org), examples are now added directly to the repository. I have also created a website dedicated to this effort for other libraries (while the maintainers are being convinced of the idea!) and even for programming languages themselves. Several Qt examples, C++ examples and Boost examples are available. The Boost Graph Library (BGL) has been receptive, and we are now in the process of moving the BGL examples into the Boost repository.

What are you waiting for? Go out there and start exampling!