Making Public Records Public: Why open formats are essential for sharing and preserving government data.



By Chander Kant, CEO of Zmanda.

Have you ever tried to retrieve a public record from your local, state or federal government? Despite their name, many public records have not been simple or free for citizens to access. Until recently, obtaining copies of even the most basic records has been a grueling process.

There are two layers of barriers in getting to data stored at a government agency:

Political or Bureaucratic: Much public data remains buried beneath a myriad of byzantine processes. Recent laws and citizen activism are breaking down many of these barriers. Of course, some government data may remain unavailable for reasons such as national security.

Technical: Despite the best political and bureaucratic intentions, several technical factors may limit access to public data. This is especially the case for archived data, which may be stored in a format that modern applications cannot read, or on media that is no longer easy to access (think of trying to read a floppy disk tucked away in storage).

Early initiatives toward making information available to the public, such as freedom of information legislation, have overburdened government agencies with requests for legacy information that is inaccessible or stored in multiple incompatible systems.

With the advent of the Internet, some government agencies began sharing records online. However, what was posted was extremely limited, incomplete, and stored in a format that was impossible to use for anything beyond a cursory search of individual records.

Breaking the Political Barriers

This inefficiency became the catalyst that led the current U.S. administration to draft the Open Government Initiative and the subsequent Directive in 2009, which states the following:

To the extent practicable and subject to valid restrictions, agencies should publish information online in an open format that can be retrieved, downloaded, indexed, and searched by commonly used web search applications. An open format is one that is platform independent, machine readable, and made available to the public without restrictions that would impede the re-use of that information.


Adoption of the Open Government Directive is a significant change for a government that has been inundated with legacy applications and incompatible systems. It is a clear articulation of the political will to remove barriers to accessing public data. This initiative will steer the government away from proprietary formats, which are hard to preserve, hard to re-use, and typically either require expensive proprietary software or operate only on specific platforms.

The Value of Openness

The Open Government Initiative marries the intent behind the freedom of information legislation of the 1960s with the technological advancements of today. Making data open and readily available to the public has numerous benefits. When citizens become informed, they become more engaged in the operations of the government and participate more in its decision making. Open data also encourages commercial re-use of the information for new and innovative applications that will help stimulate the economy.

Open data is a valuable resource to society, but simply posting it online is not enough. For data to be truly open, it must be stored in an open format through which it can be shared, repurposed, archived, and retrieved without risking obsolescence, running into unintended technological limitations, or requiring the use of proprietary applications. The format of the data determines the value of the resource and the extent to which it can be analyzed and repurposed.
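To make the point about formats concrete, here is a minimal sketch in Python. It assumes a record set published as CSV, a plain-text open format, and shows that such data can be read and re-published as JSON using nothing but a freely available standard library. The permit records, column names, and function names are all invented for illustration; they do not come from any real agency.

```python
import csv
import io
import json

# Hypothetical sample of a public record set published as CSV.
# The columns and values are invented purely for illustration.
SAMPLE_CSV = """permit_id,issued,neighborhood
1001,2010-01-15,Downtown
1002,2010-02-03,Riverside
"""


def csv_to_records(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def records_to_json(records):
    """Serialize the records as JSON, another open format."""
    return json.dumps(records, indent=2)


if __name__ == "__main__":
    records = csv_to_records(SAMPLE_CSV)
    print(records_to_json(records))
```

No vendor software or license is needed at any step, which is the practical meaning of "platform independent" and "machine readable." The same data locked in a proprietary binary format would first require the original application, or a reverse-engineered reader, before anything like this could be attempted.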

Governments have learned this lesson the hard way, as many of their archives are stored in legacy formats that are more than 30 years old. Many archives are assumed to be accessible in perpetuity. Building a process that allows these records to be retrieved and made available in a searchable open format will be no small undertaking. This new directive is designed to ensure that we don’t continue to compound the problem in the years to come.

Applications for Government Data

When considering new ways that public data can be used, the opportunities are endless. Once access is opened up and people are free to be creative and innovative in how they use it, public data is no longer confined within the walls of government offices, but unleashed through the minds of the masses.

Sharing the data online in a well-defined format allows for data “mashups,” where information from multiple sources is integrated to create a new application. Correlations are made between the data to create new uses and possibly even new kinds of industries.

Several examples of data mashups using public records exist today, including the American Lung Association’s use of the EPA’s Air Quality System database to create a State of the Air report. Another example is EveryBlock’s mashup of crime report data and Google Maps to create a mapping tool with greater sophistication that empowers citizens to learn more about their community.
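In the simplest case, a mashup is a join between two independently published data sets on a shared key. The sketch below, in Python, is only an illustration of that idea and not how any of the projects above are actually built. The county codes, names, and figures are made up, and `load` and `mashup` are hypothetical helper names.

```python
import csv
import io

# Hypothetical extracts from two independent public data sets.
# All codes, names, and numbers are invented for illustration.
AIR_QUALITY_CSV = """county_code,ozone_days
001,12
002,3
003,7
"""

POPULATION_CSV = """county_code,county_name,population
001,Adams,50000
002,Baker,120000
"""


def load(text, key):
    """Index the rows of a CSV document by the given key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def mashup(air_text, pop_text):
    """Combine both data sets, keeping counties present in each (inner join)."""
    air = load(air_text, "county_code")
    pop = load(pop_text, "county_code")
    combined = []
    for code in sorted(air.keys() & pop.keys()):
        combined.append({
            "county": pop[code]["county_name"],
            "population": int(pop[code]["population"]),
            "ozone_days": int(air[code]["ozone_days"]),
        })
    return combined


if __name__ == "__main__":
    for row in mashup(AIR_QUALITY_CSV, POPULATION_CSV):
        print(row)
```

The join only works because both publishers used a well-defined format and a common identifier. County 003 appears in just one source, so it drops out of the result, a reminder that the quality of a mashup depends on how consistently the underlying data sets are published.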

Breaking the Technical Barriers

In concert with the Open Government Initiative, the United States federal government launched a website that, as of March 2010, housed about 170,000 data sets from 47 different federal departments, agencies and regional institutions. The data on this website is publicly accessible and is stored in open formats so that it can be easily viewed and exported for use in other applications. The concept behind this site quickly became a worldwide inspiration, with several cities, states and countries launching similar sites for sharing public sector data. For example, one such site, launched in early 2010, was very well received and is considered by many, especially application developers, to be easier to use than its US predecessor.

There has been much debate about what makes a data format really open. Of course, a cleanly laid out standard blessed by an independent standards organization is a must-have. But, in my opinion, what makes a format completely open is open source software that provides a reference implementation for accessing and manipulating the data. Otherwise, a complicated definition can be interpreted in different ways, causing accessibility headaches a few years or decades from now.

Governments worldwide have made great strides in improving the accessibility of public information in the last year, but it’s just the beginning. Opening up government information will be a gradual process, but these initial steps and early successes have enough momentum to accelerate it, making both governance and citizenship better in the near future.


About the author

Chander Kant is the CEO and founder of Zmanda. Chander provides a unique combination of leadership in open source and data protection software. He has been involved on both the technology and business sides of open source software and was named one of the "Top 20 Linux Luminaries" by Linux World Magazine in 2004. Prior to Zmanda, Chander founded and ran LinuxCertified, Inc., an open source product and services company.