Sunlight Foundation's Eric Mill scouts out new developments in government | Opensource.com
Interested citizens and government professionals, meet your new pal, Scout. It sends you notifications when new developments in government happen—your government, your departments of interest, your items of relevance.
We caught up with Scout's creator, Eric Mill, a web and mobile developer at Sunlight Foundation, who gave us the details of the technology powering Scout and explained why we thought this tool already existed.
Mill is an expert at developing technology that makes government more transparent, and he is an avid supporter of open source projects.
Why should people use Scout?
Scout rapidly searches all kinds of government activity—bills, regulations, speeches—at the state and federal level, and can notify you about all of it.
This is an immense amount of information. Scout's goal is to quickly find the needles within that haystack that are relevant to you. If you care about an issue, be it as an environmental activist, a hunting enthusiast, or a legislative affairs director for a company, this is a vital function.
I built Scout as part of my work at the Sunlight Foundation, and we created it in part for our own use: Sunlight's policy team follows all sorts of open government issues, from campaign finance to the Freedom of Information Act (FOIA), and they want to know about new developments as soon as possible.
Part of this is following simple keywords, like "FOIA" and "lobbying". But our policy team, like many professionals, has the expertise to know that certain kinds of changes will be accompanied by very specific legal language. We engineered Scout to be able to search very quickly over large amounts of legal text. This has already directly led to successes in their work, and we think Scout will be extremely helpful for citizens and professionals around the country.
Why are you passionate about Scout?
The kind of pan-governmental search and notification service that Scout provides is so foundational and obvious that a lot of people overlook that it doesn't exist, or they assume that it does already. There are definitely free services out there that provide pieces of it, but tying this breadth of government information together in a way that's both approachable and powerful is a problem we don't feel is solved yet.
Where this level of service does exist, it costs a lot of money. The political intelligence industry is quite lucrative, and there must be a reason so many powerful companies and organizations are paying for dashboards and alerts that track their issues.
Before I built Scout, I built an Android app for Sunlight, called Congress. I did it on a whim, and there was some internal skepticism about its value. The app turned out to be a surprise success, both in raw downloads and in the diversity of the user base: lots of citizens and professionals alike. I was also surprised by how often people took advantage of push notifications. Almost half the time, users are opening the app by tapping a notification.
I once got a thank-you email from a lobbyist who worked for a tiny non-profit that advocates for safer walks to school for kids. She carried the Congress app with her on Capitol Hill every day. It was extremely gratifying, and made me want to build things that were so useful that professionals would want to depend on them in their work, especially those who maybe don't work at places that can afford expensive pay services.
Give our technical folks a sneak preview of what's under the hood.
The actual data that Scout searches and alerts over, however, comes from other sources–Scout is a live API client. When you do a general search on Scout for "open source," it hits four different remote API endpoints in parallel to find your results, then quickly parses, transforms, and displays results from each one as they come back.
For state legislation, it hits Sunlight's Open States Project, which has an API and is open source. For speeches in Congress, it uses Sunlight's Capitol Words, which also has an API and is open source. Our information on legislation in Congress comes from a wide variety of sources at different intervals, and we pipe it through a low-key API (once again, open source) that I built over all sorts of real-time Congressional information. Finally, for federal regulations, we index content from the amazing FederalRegister.gov API, an official API of the US government that streams every proposed and final regulation in JSON, throughout the day. And yes, FederalRegister.gov is also an open source government website.
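The parallel fan-out described above might look roughly like the following Python sketch. It is illustrative only and is not Scout's actual code: the four source names, the `make_fake_source` helper, and the result fields are all hypothetical, and each "source" sleeps briefly instead of calling a real API. The point is the shape of the technique: submit every search at once, then handle each result set as soon as it arrives.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_fake_source(name, delay):
    """Build a hypothetical search function that stands in for a remote API."""
    def search(query):
        time.sleep(delay)  # simulate network latency; no real request is made
        return [{"source": name, "title": f"{query} result from {name}"}]
    return search


# Hypothetical names for the four kinds of data mentioned in the interview.
SOURCES = {
    "state_bills": make_fake_source("state_bills", 0.03),
    "speeches": make_fake_source("speeches", 0.01),
    "federal_bills": make_fake_source("federal_bills", 0.02),
    "regulations": make_fake_source("regulations", 0.04),
}


def search_all(query, sources=SOURCES):
    """Query every source in parallel; yield (name, results) as each finishes."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {pool.submit(fn, query): name for name, fn in sources.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                yield name, future.result()
            except Exception as exc:
                # One failing source should not hide results from the others.
                yield name, [{"source": name, "error": str(exc)}]


if __name__ == "__main__":
    for name, results in search_all("open source"):
        print(name, results)
```

Because results are yielded in completion order, a slow source delays only its own section of the page rather than the whole search.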
The Scout website uses MongoDB for a backend, which is ideal for a consumer of remote JSON-based APIs, because the contents of those APIs can be dumped more or less directly into the database, and transformed only upon render. This keeps the data pipeline small, maintainable, and cognitively simple to work with.
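As a rough sketch of the "store raw, transform only upon render" idea, the Python below keeps API items untouched and applies source-specific formatting at display time. Everything here is an assumption for illustration: a plain list stands in for a MongoDB collection, and the field names (`title`, `publication_date`, `speaker`, `excerpt`) are invented rather than taken from any real API.

```python
import json

# In-memory stand-in for a MongoDB collection (assumption for this sketch).
raw_items = []


def store_raw(source, payload_text):
    """Parse an API response and store each item unmodified, tagged by source."""
    for item in json.loads(payload_text)["results"]:
        raw_items.append({"source": source, "data": item})


# Render functions are the only place that knows each API's field names.
# The fields used here are hypothetical.
RENDERERS = {
    "regulations": lambda d: f"{d['title']} ({d['publication_date']})",
    "speeches": lambda d: f"{d['speaker']}: {d['excerpt']}",
}


def render(stored):
    """Turn one stored item into display text, based on where it came from."""
    return RENDERERS[stored["source"]](stored["data"])


if __name__ == "__main__":
    store_raw("regulations", json.dumps(
        {"results": [{"title": "Example rule", "publication_date": "2012-06-01"}]}
    ))
    print(render(raw_items[0]))
```

With a real document database, the `append` call would become an insert of the same dictionary. The design benefit is that a change in an upstream API's fields touches only one render function, not the stored data.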
All of the APIs that Scout directly hits use a Lucene-based full text search engine (mostly ElasticSearch) and allow clients to run "query string" queries, which is what lets Scout offer an "advanced search" option that allows Lucene query string syntax.
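To make the "query string" idea concrete, here is a small Python sketch that assembles a Lucene-style query string from required and excluded terms, then wraps it in the JSON body shape that Elasticsearch's `query_string` query accepts. The helper names and the `text` default field are assumptions for illustration, and the sketch does not escape Lucene's other special characters.

```python
import json


def quote(term):
    """Wrap multi-word terms in double quotes so they match as a phrase."""
    if " " in term:
        return '"' + term.replace('"', '\\"') + '"'
    return term


def build_query_string(required, excluded=()):
    """Join required terms with AND and append each excluded term with NOT."""
    if not required:
        raise ValueError("at least one required term is needed")
    query = " AND ".join(quote(t) for t in required)
    for term in excluded:
        query += f" NOT {quote(term)}"
    return query


def to_search_body(query_string, field="text"):
    """Build a request body for a query_string search (field name is assumed)."""
    return {"query": {"query_string": {"query": query_string,
                                       "default_field": field}}}


if __name__ == "__main__":
    q = build_query_string(["freedom of information", "lobbying"], ["hunting"])
    print(q)
    print(json.dumps(to_search_body(q), indent=2))
```

A user of an "advanced search" box would type the query string directly; a helper like this is only useful when a client builds the string from structured input.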
What aspirations do you have for Scout? How do you see Scout improving open government?
In the near term, there's still lots more information that could be flowing through Scout that isn't yet. We'll be adding support for public comments on regulations, various types of important documents and reports, draft legislation, and much smarter searches for legal citations.
Longer term, as more people use Scout, I'm interested in finding ways to surface people's areas of expertise to other users. People who've worked in a field for a while know the relevant terms of art and specific sections of the law to follow, and they also know what terms to exclude to remove "noise." Transferring this to newcomers could be in the form of suggested searches, bills to follow, things like that, using aggregate data to protect privacy.
Another approach we've experimented with is creating curated sets of alerts, using a basic tagging system, that lets you put your name and a description on a collection of searches, bills, feeds, whatever, that you think others who care about the issue should follow. Users visiting that collection can subscribe to everything in it with a single click, and automatically stay subscribed to any additions. We haven't invested as much effort into this aspect of the site, but we may expand it and try to make a compelling platform for people to demonstrate and share their expertise in a very direct way.
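One way to get the "stay subscribed to any additions" behavior is to have a user follow the collection itself rather than copy its items. The Python sketch below shows that design under stated assumptions: the `Collection` and `User` classes and their fields are hypothetical and are not drawn from Scout's code.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """A named, described set of things to follow, curated by one person."""
    name: str
    owner: str
    description: str
    items: list = field(default_factory=list)  # searches, bills, feeds, etc.


@dataclass
class User:
    name: str
    followed: list = field(default_factory=list)  # Collection references

    def follow(self, collection):
        """Subscribe to a whole collection with a single call."""
        if not any(c is collection for c in self.followed):
            self.followed.append(collection)

    def alerts(self):
        """Resolve followed collections to their current items, without duplicates."""
        seen = []
        for collection in self.followed:
            for item in collection.items:
                if item not in seen:
                    seen.append(item)
        return seen


if __name__ == "__main__":
    foia = Collection("foia", "curator", "FOIA reform", ['search: "FOIA"'])
    reader = User("reader")
    reader.follow(foia)
    foia.items.append("search: lobbying")  # added after the reader subscribed
    print(reader.alerts())
```

Because `alerts` reads the collection at lookup time, anything the curator adds later reaches every follower with no extra bookkeeping.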
More broadly, I'd like to see people take away from Scout that simple can still be powerful when it comes to making government data actually useful to people. It's our hope that Scout is approachable enough for even mildly engaged citizens to feel more empowered, and capable enough that professionals feel comfortable relying on Scout to do their job.
How do you use the open source way in your everyday life?
Everything I make in my spare time is open source, even the disproportionately complicated PHP code that powers my Christmas detection site. At work and at home, everything I do is built on stacks of open source software, and I owe those projects countless man hours of my life that I did not need to expend on the foundation they provided.
The free, collaborative culture that has sprung up out of open source, especially as exemplified by GitHub, is a daily inspiration to me. The barrier to helping a friend, creating a side project on a weekend, and contributing to something larger than yourself has never been lower, and it is catalyzing insane amounts of activity.
It is all the more gratifying to see that companies like GitHub (and Red Hat) can be so successful as businesses while aligning their incentives firmly with the open source community.
Author's note: A previous version of this interview was incorrectly published. The original interview has been restored. We apologize for this inconvenience.