Should setting information free get you locked up? Aaron Swartz, JSTOR, and the theft of information

Aaron Swartz is 25 years old. He’s smart, tenacious, talented. And, in the view of the US Attorney’s office, a dangerous man, currently charged with wire fraud, computer fraud, obtaining information from a protected computer, and criminal forfeiture. He was released on a $100,000 bond pending a September trial.

His crime? Downloading more than four million documents from the JSTOR digital library.

What he did and whom he did it to

JSTOR, which stands for Journal Storage, is a not-for-profit organization that offers a collection of academic journals and publications for license to libraries, educational institutions, publishing companies, and individuals--for a fee. These organizations then turn around and offer JSTOR access to their members, students, or employees, usually through library services.

The Massachusetts Institute of Technology (MIT)--the network through which Swartz did his dastardly downloading--offers free access to its students and faculty. However, Swartz was not a student at MIT. He broke into a network closet to gain direct access and used a faux student account (Gary Host or, cleverly, ghost) to log in.  

In September 2010, Swartz began accessing and downloading the JSTOR files. Eventually, his download rate raised the suspicions of JSTOR and MIT admins and prompted JSTOR to deny access to all MIT-identified computers. Swartz was found out and persuaded to turn over the hard drives that stored the data he’d copied. JSTOR decided not to pursue the matter once the material was returned and Swartz agreed to make no further efforts to distribute it. The US Attorney’s office chose to continue the prosecution.

Show me the money

A $100,000 bond is pretty high. There are lots of crimes for which the average bond is far less. Assault. Battery. Rape. Manslaughter. (Murder typically is unbondable, or at least a cool million.) Theft bonds out at between $5,000 and $50,000, depending on the value of the items stolen.

So what was the value of the documents that were taken? According to the indictment, JSTOR's annual subscription fee for a large research university can be more than $50,000. "Portions of the subscription fees are shared with the journal publishers who hold the original copyrights," it continues. The publishers set the fees for their articles and choose which ones can be purchased individually.

Value is also dependent upon rarity. In this case, JSTOR’s files remained unchanged and (mostly, save for the outage at MIT) available. What Swartz was attempting to take from JSTOR was the ability to enforce rarity by charging a premium for access.

Who is Aaron Swartz?

Swartz co-authored the RSS 1.0 specification before he was old enough to drive. He got into Harvard and left after a year to start Infogami--a tech company later bought and integrated into Reddit, which was then bought by an arm of Condé Nast. He started another company, Jottit. He is an activist and co-founded Demand Progress, a largely digital, progressive political organization.

And why is he doing this?

Swartz has been under this kind of suspicion before. The FBI got involved in 2009 when he--with the encouragement of Carl Malamud and others--began downloading and openly redistributing documents from PACER (Public Access to Court Electronic Records), which was in the midst of a trial run of openness.

PACER documents had been made available to a small number of libraries (17), and Swartz had managed to download 20% of the available info before the service was shut down. Though Swartz was investigated, no charges were filed.

Swartz, Malamud, and others felt the PACER documents and other government-generated records--many of which are legally required to be available to the public--should be made easily accessible, on the public web, where they could be freely copied and indexed by search engines like Google.

JSTOR is an archive of largely academic articles, many of which are based on research funded by public grants or written by people working for state-sponsored universities. Some people, like Swartz, believe that this puts the articles on the same level as any other public record and that they should also be freely available online.

But even if the information should be public, the process of moving records online encounters roadblocks. The fees that are generated through requests for information help pay for maintenance. Converting large volumes of traditionally stored information to digital form is arduous, especially since some must be redacted or otherwise altered to protect the personal, private information of involved parties. And some types of information are controlled by knowing who has access to them.

Repercussions of the JSTOR case

The revolution in information technology affects everyone. Despite objections, the way we do things involving data has changed and will continue to change. For instance, who bothers to look up something in an encyclopedia anymore? How many people still visit the bank teller to find out their balance or deposit a check? We trust PayPal and Amazon with our financial information, and it is clear that even public agencies take advantage of the ability to record, store, secure, and distribute private information over the Internet.

Swartz's goal was to accelerate this process. When it comes to JSTOR, in his view, the roadblocks have already been overcome. He believes the documents should be publicly available because their contents were in part publicly funded. The work of digitizing them has been done, and they do not contain private information.

“It’s like trying to put someone in jail for allegedly checking too many books out of the library,” said David Segal, executive director of Demand Progress, a political action group Swartz helped found.

But then, most library patrons don't check out a pile of books after first getting a fake library card and stealing the key to the library. Swartz has been indicted on multiple charges, including computer fraud and unlawfully obtaining information from a protected computer. The charges could result in up to 35 years in prison and a $1 million fine.

Whatever one thinks of his methods, the case poses several questions about the cost of access and the value of information. The amount of Swartz's bail suggests a far more serious crime--theft of valuable property--than the charges brought against him. The verdict may tell us something about how much information is worth.


About the author

bascha - Editor, writer, and developer. I wear many hats, including the red one. Graduate of UNC-Chapel Hill School of Journalism; long-time interest in all things geeky. Editor of Red Hat Magazine and grizzled industry veteran, including time as an archivist for SunSITE UNC (now and ten-plus years at my current gig. I love: