Twitter versus traditional health-tracking methods
Tracking real-time health with Twitter data serves as an early warning system
As the open source ethic has changed the way that we share and develop resources, crowdsourcing is redefining how we can create new resources based upon that willingness to share. One example of crowdsourcing at work for the betterment of us all is public health researchers turning to Twitter to collect real-time data about public health.
Researchers at Johns Hopkins University have developed a new software algorithm that allows them to filter the tweet stream for health references, then sort the results into health categories. Researchers Mark Dredze and Michael Paul tested their algorithm on two billion tweets, then analyzed the resulting 1.5 million health-related messages. They found patterns related to flu, allergies, insomnia, depression, cancer, pain, and several other ailments. Because location information is available for tweets sent from GPS-equipped mobile devices, the researchers were able to pinpoint the origin of many health messages. The goal of the project is to predict when and where illness will spread in order to give local public health departments time to plan and allocate resources. Dredze and Paul believe that Twitter data provides a more up-to-date snapshot of public health than is available from the Centers for Disease Control and Prevention (CDC).
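The idea of sorting health tweets into categories and tallying them by location can be illustrated with a small Python sketch. This is not the Johns Hopkins algorithm, which the article does not describe in detail; the keyword lists, the `categorize` and `count_by_location` helpers, and the sample tweets are all hypothetical and stand in for a far more sophisticated model.

```python
from collections import Counter

# Hypothetical keyword lists, chosen only for illustration.
CATEGORY_KEYWORDS = {
    "flu": {"flu", "fever", "influenza"},
    "allergies": {"allergies", "pollen", "hayfever"},
    "insomnia": {"insomnia", "sleepless"},
}


def categorize(text):
    """Return the set of categories whose keywords appear in the tweet text."""
    words = {w.strip(".,!?#\"'").lower() for w in text.split()}
    return {cat for cat, kws in CATEGORY_KEYWORDS.items() if words & kws}


def count_by_location(tweets):
    """Count (location, category) pairs for tweets that carry a location."""
    counts = Counter()
    for tweet in tweets:
        location = tweet.get("location")
        if location is None:
            continue  # no GPS data, so the tweet cannot be placed on a map
        for category in categorize(tweet["text"]):
            counts[(location, category)] += 1
    return counts


# Made-up sample tweets; real input would come from the tweet stream.
sample = [
    {"text": "Home with the flu and a fever.", "location": "Baltimore"},
    {"text": "Pollen is brutal today #allergies", "location": "Baltimore"},
    {"text": "Can't sleep again, insomnia wins", "location": None},
    {"text": "Great game last night!", "location": "Rochester"},
]
counts = count_by_location(sample)
```

Counts grouped this way are the kind of per-location signal a health department could watch for spikes, although plain keyword matching on its own would be too crude for real use.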
Filtering pertinent health data is the most challenging part of the project. Twitter averages 340 million daily tweets that are mostly reactions to new events. When a flu outbreak becomes a news event, a certain number of tweets will be commentary on the outbreak rather than people talking about having flu symptoms. This means that the filtering algorithm must do more than search for health-related keywords; it also must filter out "noise" that doesn’t pertain to personal health.
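A rough sketch of this two-stage idea in Python: first match health keywords, then discard tweets that look like commentary rather than personal reports. The cue lists and the `is_personal_health_tweet` function are assumptions made for illustration; the researchers' actual filter is not described here and is certainly more elaborate than a few regular expressions.

```python
import re

# Hypothetical cue lists, chosen only to illustrate the approach.
SYMPTOM_TERMS = re.compile(r"\b(flu|fever|cough|sore throat)\b", re.IGNORECASE)
PERSONAL_CUES = re.compile(r"\b(i|my|me)\b", re.IGNORECASE)
NEWS_CUES = re.compile(
    r"(https?://|\b(outbreak|officials|reports?|cdc|news)\b)", re.IGNORECASE
)


def is_personal_health_tweet(text):
    """Return True if the tweet looks like a first-person health report."""
    if not SYMPTOM_TERMS.search(text):
        return False  # stage 1: no health keyword at all
    if NEWS_CUES.search(text):
        return False  # stage 2: looks like news or commentary ("noise")
    return bool(PERSONAL_CUES.search(text))  # require a first-person cue


personal = is_personal_health_tweet("I think I have the flu, my fever won't break")
commentary = is_personal_health_tweet(
    "CDC reports flu outbreak spreading in the Midwest http://example.com"
)
```

Here `personal` is `True` and `commentary` is `False`. Hand-written rules like these miss many cases, which is one reason a filter of this kind is hard to get right.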
The Johns Hopkins project is not the first effort to use Twitter to gather health data. In mid-2012, the U.S. Department of Health and Human Services challenged developers to design web-based applications that track health trends in real time using Twitter data. The challenge, named Now Trending: #Health in My Community, was inspired by studies of the 2009 H1N1 flu pandemic and the 2010 cholera outbreak in Haiti, which demonstrated that social media can identify disease outbreaks earlier than traditional tracking methods.
Out of a field of 33 applications submitted to the Now Trending challenge, MappyHealth was chosen as the most innovative, scalable, dynamic, and user-friendly entry. Developed by Social Health Insights LLC, MappyHealth tracks 234 health-related terms in the tweet stream in real time and applies a set of 29 conditions to determine whether a message is relevant. Once data analysis is complete, the application presents the information in a variety of formats. In addition to a web-based application, MappyHealth offers mobile apps for smartphones and tablets. The MappyHealth FAQ provides examples of health trends that have been identified by trend spikes in Twitter data.
A third application that uses Twitter to track public health data is Germ Tracker. Developed by University of Rochester researcher Adam Sadilek and collaborators, this application filters tweets to identify those that may come from people who are sick, then displays the tweets as dots on a national map. The app itself invites social interaction: users can click a button on any tweet to tell the app whether the tweeter is actually sick, which lets them flag tweets that don't pertain to illness (such as someone saying that they're sick of school or that their new car is "sick").
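One way such crowd feedback could be put to use is a simple majority vote per tweet, sketched below in Python. The article does not say how Germ Tracker actually processes these clicks, so the `FeedbackTally` class, its method names, and the `min_votes` threshold are all hypothetical.

```python
from collections import defaultdict


class FeedbackTally:
    """Collect 'sick' / 'not sick' votes per tweet (hypothetical design)."""

    def __init__(self):
        self._votes = defaultdict(lambda: {"sick": 0, "not_sick": 0})

    def record(self, tweet_id, is_sick):
        """Store one user's click for the given tweet."""
        key = "sick" if is_sick else "not_sick"
        self._votes[tweet_id][key] += 1

    def label(self, tweet_id, min_votes=3):
        """Return True/False by majority, or None if votes are too few or tied."""
        votes = self._votes.get(tweet_id)
        if votes is None:
            return None
        total = votes["sick"] + votes["not_sick"]
        if total < min_votes or votes["sick"] == votes["not_sick"]:
            return None
        return votes["sick"] > votes["not_sick"]


tally = FeedbackTally()
for _ in range(3):
    tally.record("tweet-1", is_sick=False)  # e.g. "my new car is sick"
tally.record("tweet-2", is_sick=True)       # only one vote so far
```

With these votes, `tally.label("tweet-1")` returns `False` and `tally.label("tweet-2")` returns `None`, because a single click falls below the assumed threshold. Labels gathered this way could also serve as corrections for an automated classifier.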
Despite the promise that these apps seem to hold, the current state of research indicates that there are limitations to Twitter's usefulness as a tool for monitoring public health. Twitter users do not represent all locations, socio-economic classes, and age groups. Two of the largest demographic groups that are underrepresented on Twitter are among the most likely to be affected by flu outbreaks: children and the elderly. According to Johns Hopkins researcher Michael Paul, rather than providing a complete picture of current health trends, these apps are more valuable as early warning systems that can deliver health data much faster than the data-gathering techniques used by the CDC.