Nicholas Folk and Scott Buchanan co-authored this article.
We are three students in the Bachelor of Computer Science second degree program at the University of British Columbia (UBC). As we each have cooperative education experience, our technical ability and contributions have increasingly become a point of focus as we approach graduation. Our past couple of years at UBC have allowed us to produce some great technical content, but we all found ourselves with one component noticeably absent from our resumes: an open source contribution. While the reasons for this are varied, they all stem from the fact that making a contribution involves a set of skills that goes far beyond anything taught in the classroom or even learned during an internship. It requires a person to be outgoing with complete strangers, to be proactive in seeking out problems to solve, and to have effective written communication.
While these skills were not entirely foreign to us, we knew we were still developing them. As a result, we were hesitant about jumping into an open source project. It was not until we discovered a directed studies course aimed at getting students involved in open source development that we were motivated to jump in.
Specifically, the course is structured around contributing to a humanitarian free open source software (HFOSS) project and interacting with the people and processes that make it work. Below is a brief introduction of the two projects in which we were involved followed by some questions and answers we put together to capture our reflections on HFOSS development as first-time contributors. Our aim is to present insight to budding developers who might be nervous to take the open source plunge, to more experienced developers who might not have any involvement with open source development, and possibly to experienced developers that do work on open source projects and want to know what new developers experience when joining their respective teams.
Project 1: Sahana Eden
Sahana Eden is an open source humanitarian platform that can be used to provide solutions for disaster management, development, and environmental management sectors. It is one of the projects managed by the Sahana Software Foundation and has seen critical involvement in a number of disaster relief deployments over the past few years. Scott and Matt are part of this project.
Project 2: OpenMRS
OpenMRS is a medical records database aimed at creating easier access to patient data for healthcare workers in developing nations. The project began in 2004 and has since grown to be used around the globe, showcasing its necessity in the fight against pandemics such as HIV/AIDS and drug-resistant tuberculosis, as well as its usefulness in basic primary care and oncology. The primary goal of the OpenMRS project is to create an easy-to-use and extensible medical record system that users can modify to suit their own needs. Currently, many medical records in developing nations are still kept either in physical files or spreadsheets stored on local drives. OpenMRS strives to create a single database on a remote server, creating faster, more reliable access to patient information across a larger geographic area. Nicholas is part of this project.
Scott Buchanan (SB): First contact with the Sahana Eden community was a simple process. They have an active mailing list and an IRC channel to facilitate communication within the community. Through the mailing list, Matt and I introduced ourselves and our goals for working with the community. We received a few positive responses and getting involved seemed well underway.
Nicholas Folk (NF): OpenMRS provides a myriad of tools to facilitate communication with the community. Like the Sahana Eden community, OpenMRS has a mailing list and an IRC channel, but also a discussion forum where new developers are encouraged to introduce themselves and ask questions. There is also a set of live discussions that developers can join via online video conferencing aimed at discussing new ideas and encouraging newbie developers to get up to speed on various aspects of the project. Most of these discussions were difficult for me to join due to the difference in time zones, but luckily the organizers record and upload many of the screencasts to their YouTube channel.
While I did introduce myself to the community on the discussion forum, I received more help and direction when I messaged a handful of lead developers directly. They pointed me to new developer guides and specific sub-projects to which I might be interested in contributing. Overall, I was impressed with the welcoming attitude and assistance that everyone was willing to give to a newcomer.
Setting up a framework
Matthew MacLennan (MM): The process was straightforward. The Sahana Eden website has a well-documented section for setting up a work environment, including instructions for configuring libraries on your machine, setting up version control, and getting a test instance up and running. Conveniently, there was a specific guide for my current operating system, GNU/Linux. I can't comment on how setting up a development environment would be for other operating systems. One resource that stood out for me was the amount of test data available to pre-populate test instances on your local machine. By having these available, it was simple to see how different deployments of Eden looked and worked.
SB: One of the concerns I had prior to the class was the difficulty that might be involved in the dev environment set up and the lack of any in-person help. I have had some internship experiences where it has taken me weeks to properly install my environment and get all the necessary permissions. Fortunately, the set up with Sahana Eden was really, really simple. I run OSX, so I had to do everything through a virtual machine, but it was straightforward and well documented. If I recall correctly, I had everything up and running with a populated database inside of a few hours.
NF: Due to extensive documentation on the OpenMRS Wiki, as well as their comprehensive new developer's guide, setting up a development framework was mostly straightforward. I did experience minor frustrations, such as installation errors and version compatibility issues, but one positive aspect about encountering these roadblocks was the warm help I received from the community. There was one afternoon where a lead developer took a few hours out of his day to help me (via the IRC channel) work through a database installation error. The error was reproducible and unrecorded, so I wrote about the problem and the specific steps to reproduce and solve the issue (once I had done so myself) in the discussion forums. I received warm feedback from other developers and inspired another lead developer to update a good chunk of the setup wiki. Everyone was willing to devote time to help someone they had never even met, which really demonstrated a positive atmosphere that I felt encouraged to join.
With OpenMRS supporting dozens of open source modules, it is not surprising that compatibility becomes an issue at scale. A tighter (perhaps quarterly) review of the setup process by senior developers would help the community tremendously. Aside from that, however, new developers should not have too much trouble with the setup process.
MM: My initial involvement entailed following the instructions the Sahana Eden project had for new contributors. It suggested finding small bugs in the bug report database or finding to-dos for small features in the code base. In the bug report database most bugs were a couple of years old, leaving not many options to choose from. I decided to explore a few of the most recently reported bugs. Unfortunately, most of the bugs were not replicable. It could have been due to the code base having changed since the bug report was written or the bug reporter not having a correct instance of the project up and running. I made sure to comment on the bugs I was unable to replicate to help others avoid spending their time on them.
Eventually, I found a bug that I could replicate. I figured out the various options for a fix, and conveyed them to the mailing list to see which the community thought was most suitable. After receiving some feedback, I fixed the bug. But after further discussion with a senior community member, it was determined I was fixing a bug for something that unbeknownst to me was currently deprecated. As a result, my fix for that bug report was not merged into the code base, but I still found it a good learning experience for getting my feet wet.
NF: OpenMRS provides a wall of introductory tickets on the front page of their customized JIRA issue tracker designed for newcomers to get more comfortable with the project workflow. This made it easy to find issues where I felt knowledgeable enough to perform the task and as a bonus, to get more familiar with the codebase. I chose a task that entailed adding a deletion functionality to the front-end system administration management tools. Adding the functionality was straightforward, but unit testing was a bit more challenging due to my unfamiliarity with the particular libraries and frameworks like Spring and Hibernate.
The introductory tickets made the onboarding process easy and engaging.
Focus and features
MM: After my initial experience, I wanted to find a part of the Eden project that was a more current focus for members in the community. I hoped that if I found something others were also working on, it would likely be a more relevant feature to contribute to. After a couple days of reading the mailing list and development portal, I discovered a mobile application, EdenMobile, that was actively being worked on. Another developer had started this project in hopes of creating a cross-platform mobile application. The developer was approaching the work with design principles I recognized from school, which I felt would make for a richer development process. There was also the added bonus of the mobile application being a somewhat self-contained code base, which would mean not having to master the large web framework that is Eden.
Scott and I decided that we would tackle one of the development goals for EdenMobile: SMS communication. The use case was simple: an individual doesn't have a data/WiFi connection and as a result, sends an update of data to the central server using a series of SMS messages. Unfortunately, the original developer became unavailable during a time we needed to make progress for our course. As a result, Scott and I were not entirely sure how the lead wanted us to design and incorporate the SMS communication features into the main mobile application. Consequently, Scott and I decided to create a library that could be integrated at a later date by the original developer.
SB: Matt and I had a bunch of discussions about what aspect of Eden we would get involved with. As Matt mentioned, a big concern for us was the scale of Eden and the amount that we would need to learn in order to even begin thinking about how we could contribute. Ultimately, we wanted to work on something that had very defined exit and entry points and that had a limited scope so that we could get involved with little to no ramping up time. Matt eventually found the EdenMobile repo and that seemed like a great fit. There were documented to-dos and we gravitated towards the networking-based SMS task as we had just recently completed a number of courses surrounding the topic of distributed systems. This ended up being a great choice in terms of difficulty level and opportunity to learn as we were able to apply a lot of what we had learned in the classroom to a real-life scenario.
NF: I cautiously anticipated projects falling through, delays in response time, and other obstacles that would prevent me from making substantial contributions to any one particular project. I decided it would be wise to attempt to work on three different projects in case one had too steep of a learning curve or (in one case) if a budding module never got off the ground. Also, while it is reasonable to expect team members to take a day or two to respond to questions, the time constraints of my course meant that I would be better off trying to multitask while I waited for such responses.
Like most open source projects, almost all of the OpenMRS modules are on GitHub, and I was able to look at each module individually. I based my selection off of criteria such as how active the module was, how communicative the developers were (both with each other and to new contributors to the module), and how interested I'd be in working on it. GitHub makes the first of these questions quite easy to answer. Not only does GitHub explicitly show the commit history of each project, it also provides useful graphs and diagrams of the contributions when comparing different modules. For the second question, I used the JIRA issues comment sections to gauge how attentive the developers were to each other's communications, as well as their response times to my emails and messages. These first two criteria really narrowed down my scope to a small handful of modules. I selected the Chart Search Module, an Ophthalmology Module that had not begun development yet, and the Reference Application.
I am happy with my diversification approach, as the Ophthalmology Module has yet to gain any traction and I experienced some miscommunications with the Reference Application team (I began work on a number of issues recommended to me by some of the team members that were deprecated or already fixed but not marked as such). Luckily, I still had the Chart Search Module to fall back on, with the lead developer of the project being more than helpful, fast to respond to my messages, and very knowledgeable of the module itself. I have been helping him refactor parts of the user interface and will soon be adding an extension to the search feature.
MM: In addition to the basic development framework, I spent a lot of time researching SMS. Scott and I sought to develop a protocol that would facilitate the exchange of data between mobile devices and the server. It was an interesting challenge to try to design a failure-tolerant messaging protocol, and the current state of our product definitely requires more testing before it could go to a production environment.
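To make the idea concrete, here is a minimal Python sketch of one way a data update could be split across several SMS messages and put back together on the server. It is an illustration only, not the protocol described above: the header layout, the 160-character limit, the 32-character header budget, and the function names `split_into_sms` and `reassemble` are all assumptions made for this example.

```python
import base64
import zlib
from typing import Dict, List, Optional

SMS_LIMIT = 160      # assumed size of a single SMS, in characters
HEADER_BUDGET = 32   # assumed space reserved for the header of each part


def split_into_sms(payload: bytes, msg_id: int) -> List[str]:
    """Encode a payload as text and split it into numbered, checksummed parts."""
    text = base64.b64encode(payload).decode("ascii")
    checksum = zlib.crc32(payload)
    chunk_size = SMS_LIMIT - HEADER_BUDGET
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)] or [""]
    total = len(chunks)
    # Hypothetical header: "<msg_id>:<index>/<total>:<crc32 hex>#" followed by the chunk.
    return [
        f"{msg_id}:{index}/{total}:{checksum:08x}#{chunk}"
        for index, chunk in enumerate(chunks, start=1)
    ]


def reassemble(parts: List[str]) -> Optional[bytes]:
    """Rebuild a payload from parts received in any order.

    Returns None while parts are missing or if the checksum does not match,
    so the caller knows to wait or to ask for a resend.
    """
    received: Dict[int, str] = {}
    total = None
    checksum = None
    for part in parts:
        header, _, chunk = part.partition("#")
        _msg_id, position, crc_hex = header.split(":")
        index_str, total_str = position.split("/")
        received[int(index_str)] = chunk  # a duplicate part simply overwrites
        total = int(total_str)
        checksum = int(crc_hex, 16)
    if total is None or set(received) != set(range(1, total + 1)):
        return None
    try:
        payload = base64.b64decode("".join(received[i] for i in range(1, total + 1)))
    except ValueError:
        return None  # text was damaged badly enough to be invalid Base64
    return payload if zlib.crc32(payload) == checksum else None
```

Numbering each part lets the receiver tolerate out-of-order and duplicate delivery, and the checksum catches a damaged part. A real protocol would also need acknowledgements and retries, which this sketch leaves out.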
SB: The main technical takeaway for me was my involvement with Twilio. When we started this project I knew that at some point we would need to convert an SMS message to a regular HTTP call, but I had absolutely no idea how that would be possible. Eventually someone linked us to Twilio and I started to explore all of the services that they provide. This sounds like a bit of an advertisement for them, but they have an incredibly easy to work with and well-documented product. As soon as I realized everything that might be possible by using their technology, it immediately sparked a few ideas for personal projects I might take on down the road.
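For readers wondering what "converting an SMS to an HTTP call" looks like in practice, here is a rough Python sketch. It assumes the service delivers each incoming text as a form-encoded HTTP POST with `From` and `Body` fields and accepts a small XML (TwiML) document as the reply, which is how Twilio's messaging webhooks are generally described. The function name `handle_incoming_sms` and the reply wording are hypothetical, and this is not the code used in the project.

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape


def handle_incoming_sms(form_body: str) -> str:
    """Turn the form-encoded body of a webhook POST into an XML reply."""
    fields = parse_qs(form_body)
    sender = fields.get("From", [""])[0]  # phone number of the sender
    text = fields.get("Body", [""])[0]    # text of the SMS
    # A real handler would hand `text` to the application here (hypothetical step).
    reply = f"Received {len(text)} characters from {sender}"
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f"<Response><Message>{escape(reply)}</Message></Response>"
    )
```

Keeping the handler a plain function that maps a request body to a response string means it can be exercised without a phone or a web server, and then plugged into whatever web framework the project already uses.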
NF: While I have a decent amount of experience with Java, I had never used it for web application development. Because of this, some of the tools that I gained familiarity with are the Spring MVC framework and JavaServer Pages (JSP), as well as Maven for automating builds. I was already somewhat familiar with Git through the EGit plugin in Eclipse, but the projects I had worked on previously were not hosted on GitHub. I gained a much stronger intuition with Git on the command line and a better understanding of the fork-and-pull workflow inherent to GitHub.
MM: My main takeaway from this experience is that contributing to an open source project is a lot harder than one might think. This is because the peers you work with often have other priorities or responsibilities that come first. This makes it difficult to acquire help and feedback while trying to implement new features, especially if only one or two community members are well versed in that particular topic. On the other hand, I am sure that people who spend a substantial amount of time working on a particular project become efficient contributors because of their experience with it.
NF: While I cannot speak on behalf of all open source projects, OpenMRS has a wonderfully supportive community that welcomes and encourages contributors. I was nervous to join other developers running at full speed, but it was a pleasant surprise to receive such a positive vibe from the team. Open source development is more of a commitment than I first thought it would be. When I initially learned about the concept of open source software, I had the impression that forking a project and making a contribution could be done in a day or even a few hours. I know now it would be unrealistic to do so for projects of a significant size. Setup can take a long time, and it would be challenging to find a project where the developer has experience with the full technology stack, so there will likely be a learning curve. My experience this summer has given me a better intuition for project planning and time management. Not only that, I have also gained the confidence to join new projects and work more independently, while knowing when to collaborate with the community.
What impresses me is that despite some of the bumps in the road, open source works. OpenMRS has successfully been improving since its beginnings over 10 years ago and it's all due to the work of hundreds of developers volunteering their time. The software is deployed all over the globe and has had a significant impact fighting epidemics.
Versus a paid gig
MM: Much of the actual work process was the same minus daily scrum and check-ins with coworkers. With Eden, I still had to take ownership of features or bugs, investigate them myself, and provide fixes just like I did in a paid work position. One difference with Eden was my flexibility to move between the different features within the project. When I decided I didn't want to continue along with my initial bug fixes, I was free to find another feature to work on. When I worked in professional settings, I had less freedom to drop my work on one feature and move on to another due to deadlines and constraints.
NF: The biggest disparity is during the early stages of joining the project. As a new developer, it is immensely helpful to have someone sitting at the desk beside me available to answer questions. On the other hand, without that level of hand holding, I learned more effective ways of researching solutions to my problems independently. My technical written communication sharpened as I needed to be precise and thorough when sharing my problems with the community (not to mention, figuring out which issues needed to go through the community and which issues I needed to solve on my own). Once I had my environment set up and I had begun to work on the Chart Search Module with a reliably communicative team member, it felt a lot more similar to my work in a paid position.
One thing I liked about working on an open source project is the level of autonomy I had in choosing which modules to work on. It can be a lot more challenging to switch teams in a structured company than in the open source world, where you can simply fork a new module, assign yourself to a task associated with that module, and make a pull request. Though the team may not have prioritized issues you would like to work on, they are more than happy that you have volunteered your time to help anyway.
Another aspect where I noticed a difference was quality control. At my paid position, almost all substantial project commits were subject to lengthy code reviews, cross-team validation, and QA testing. For instance, most of my commits at my paid position took anywhere from five to 20 iterations of code review before they were even sent to the QA team. Admittedly, quality control was a high priority at the company where I interned, so this may not reflect the industry standard. In comparison, when I committed to OpenMRS my changes were merged after my first pull request. This is not to say that there is no code review at OpenMRS (indeed, I have been asked to make changes on a pull request), nor is it to say that it is necessarily better to have overly stringent quality checks. You might get slightly higher quality code using the former technique, but it would be an unrealistic workflow for a group of remote volunteers and could stagnate progress in the open source world.
Versus past projects
SB: The big difference is accountability. In a personal project you are of course trying to make efficient software and write clean code, but ultimately you only need to live up to your own expectations. As someone who is still quite new to the development world, that usually means the first thing that works. For an open source project, however, you know your code is being reviewed and that it has to meet the standards of the community or it won't be merged. The result is that you aren't simply trying to do the first thing that works, but rather are spending considerable time making things the best they can be.
NF: I am going to have to agree with Scott here. With personal projects, I sometimes tend to sacrifice quality for convenience. With open source projects, my code has my signature on it and is on display to the rest of the world. Not only do I tend to strive for higher quality, but a code review process ensures that poor design choices or silly mistakes are corrected before merging to a master branch.
It is also nice to have a community that understands the software when you have a question. With personal projects, I tend to use Stack Overflow when I have a question, but without project scope, my question is more susceptible to the XY problem, where I get trapped into asking about a specific problem, X, when I should really be asking about the more general problem, Y. When working on OpenMRS, I have a helpful community that inherently knows the underlying context to the issues I face (and therefore, problem Y), and I have seen how much faster I resolve issues than when I work on my personal projects.
Feedback for projects
SB: This isn't necessarily for the community directly, but I think that new contributors should have to submit detailed plans on what they plan to implement and how. Even with something as simple as a bug fix there can be a lot of different ways to go about something, and not all of them will be in line with the prerogatives of the principal contributors to a project. Of course, pull requests end up being where a lot of this gets dealt with, but it might be prudent to establish some of the details upfront for someone who is not familiar with the project.
NF: First and foremost, keep up the positive vibe. I was intimidated to find an open source project and join an experienced team, but the OpenMRS community was incredibly welcoming and friendly. As mentioned before, however, I would recommend that senior developers review the setup process and the installation guide to ensure it remains bug-free and that all of the core modules are still compatible with third-party technologies and each other (at least, when installed exactly as described in the guide instructions). I would also suggest a clean up of old and outdated wiki documents and JIRA issues which created confusion for me and several other developers (based on comments in the discussion forums and the Wiki page comment sections).
NF: Scott, Matt, and I had the unique opportunity to work on our respective open source projects for a self-directed university course. For me, it was nice to have an instructor who would help me set goals and a timeline and who would keep track of my progress. If another undergraduate computer science student had this option available, I would recommend it. If not, do not despair. There are many ways to seek active open source projects (I recommend looking into current HFOSS projects like the three of us did). Do not be put off by the thought that it is too daunting or that you are not ready yet. There are always projects that have great communities, helpful onboarding documents, and a plethora of beginner tasks for the exact purpose of getting you acquainted with the codebase.
This article is part of the Back to School series focused on open source projects and tools for students of all levels.