In case you haven't noticed, it's an election year in the United States. And with the election in full swing, there is a plethora of data, from a myriad of sources, about what is on the mind of the electorate, what is driving voters to make their decisions, and how they are likely to vote.
In an earlier phase of my career, I managed survey research. A constant source of frustration was the difficulty of aggregating publicly available data and managing it when it wasn't in a structured data format.
So, when the Huffington Post announced recently that it was releasing the HuffPost Pollster API, it definitely caught my attention. It lets software developers access the information about public polls that the HuffPost Pollster team gathers and publishes on the Huffington Post.
One impetus behind this initiative is greater transparency about the information that is reflected and reported in public polls. "Since being able to understand the methodology behind opinion surveys is an important step toward increasing transparency in the opinion polling industry," the HuffPost Pollster team says, "we're including information about the methodology for each poll. And to make the data independently verifiable, we've included a link to the original source that conducted or reported the poll along with each entry."
"While there are several news organizations that aggregate and analyze polling data, as of yet none have made the underlying numbers that they collect available in a format for other software developers to build programs with. We hope that in making this data accessible we can not only help journalists, researchers and policy analysts better understand current opinions and trends, but also empower them to shed greater light on the limitations and biases of polls and the organizations that conduct them."
The initial release is big. It includes more than 215,000 responses to questions from more than 13,000 polls, which the HuffPost Pollster team has organized by subject and geography into more than 200 charts. Per their announcement, "the data feeds operate in real time, so shortly after we add a new poll to our database, it'll appear in the HuffPost Pollster API's responses and calculations."
Adding to the coolness is that the effort relies heavily on open source tools. The HuffPost Pollster team is publishing the data as an HTTP-based application programming interface, or API, with JSON and XML responses. They are releasing the data under a Creative Commons license.
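To give a feel for what working with JSON poll data can look like, here is a minimal Python sketch. It does not call the real service: the sample payload, its field names (`pollster`, `method`, `source`, `responses`, `choice`, `value`) and the helper `average_by_choice` are all hypothetical stand-ins chosen for illustration, so check the HuffPost Pollster API documentation for the actual endpoints and response schema before relying on any of it.

```python
import json

# Hypothetical sample payload. The field names are illustrative assumptions,
# not the documented HuffPost Pollster API schema.
SAMPLE_RESPONSE = """
[
  {
    "pollster": "Example Pollster A",
    "method": "Phone",
    "source": "http://example.com/poll-a",
    "responses": [
      {"choice": "Candidate X", "value": 48},
      {"choice": "Candidate Y", "value": 45}
    ]
  },
  {
    "pollster": "Example Pollster B",
    "method": "Internet",
    "source": "http://example.com/poll-b",
    "responses": [
      {"choice": "Candidate X", "value": 46},
      {"choice": "Candidate Y", "value": 47}
    ]
  }
]
"""


def average_by_choice(polls):
    """Return the unweighted mean value for each choice across all polls."""
    totals = {}
    counts = {}
    for poll in polls:
        for response in poll["responses"]:
            choice = response["choice"]
            totals[choice] = totals.get(choice, 0) + response["value"]
            counts[choice] = counts.get(choice, 0) + 1
    return {choice: totals[choice] / counts[choice] for choice in totals}


# Parse the JSON text into Python lists and dictionaries, then aggregate.
polls = json.loads(SAMPLE_RESPONSE)
averages = average_by_choice(polls)

# Collect the methodology and source link that accompany each poll entry.
methods = [(poll["pollster"], poll["method"], poll["source"]) for poll in polls]
```

A simple unweighted mean is used here only to keep the sketch short; it is not the calculation the HuffPost Pollster team uses for its charts. In practice you would replace `SAMPLE_RESPONSE` with the body of an HTTP response from the API.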
I encourage you to read the entire announcement, made in early July: "HuffPost Pollster API Enables Open Access To Polling Data."
And true to the open source way, they want to see "what people build with this data. Please send us suggestions on how we can improve the HuffPost API and let us know if you use it in your work."