Welcome to the third installment of my monthly column, where I explore how open source software and the open source way are used in the digital humanities. Every month I take a look at open source tools you can use in your digital humanities research, as well as a few humanities research projects that are using open source tools today. I will also cover news about how transparency and open exchange, both principles of the open source way, are being applied to the humanities.
Let's start with an explanation of the digital humanities. The digital humanities is where traditional humanities scholarship—or, the academic study of arts, language, history, and the like—meets the digital age. By using technology in new and innovative ways, digital humanities scholars can create research projects that explore topics in ways that were not possible (or were extremely laborious undertakings) before computers.
Text/data mining, visualization, information retrieval, and digital publishing are some of the key features of digital humanities research. With computers, it is possible to analyze text, discover patterns, and visualize data with relative ease. In addition, digital projects can be much more accessible to the general public than traditional scholarship. For example, digital humanities projects can build connections to the past, as The Papers of Abraham Lincoln project is doing with the papers of President Lincoln and as the Roy Rosenzweig Center for History and New Media will be doing by creating learning materials for the President Eisenhower E-Memorial.
In April this year, several new tutorials were published along with many other notable developments. I have highlighted the most interesting of them below. Perhaps one will inspire you in your own digital humanities research, or help you learn about this field of scholarly research.
Learn how to use Gephi
Gephi is a tool for creating visual representations of connections in a social network or similar data. One of the sample projects included with Gephi is a "co-appearance network of characters in Les Miserables." Another sample project looks at the dependencies of the Java programming language's classes. Basically, if there are connections between items in your data, Gephi can be used to visualize the connections.
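To make the idea of a co-appearance network concrete, here is a minimal Python sketch that turns a list of scenes into a weighted edge list. It is only an illustration: the scene data is made up, the function names are my own, and it assumes you will bring the result into Gephi through its spreadsheet (CSV) import, which reads Source, Target, and Weight columns for an edge table.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def co_appearance_edges(scenes):
    """Count how often each pair of names appears in the same scene."""
    weights = Counter()
    for names in scenes:
        # De-duplicate and sort so (A, B) and (B, A) count as one undirected pair.
        for pair in combinations(sorted(set(names)), 2):
            weights[pair] += 1
    return weights


def edge_list_csv(weights):
    """Return the edge list as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Hypothetical sample data: each inner list is one scene's cast.
scenes = [
    ["Valjean", "Javert"],
    ["Valjean", "Cosette", "Marius"],
    ["Valjean", "Javert", "Cosette"],
]

print(edge_list_csv(co_appearance_edges(scenes)))
```

Saving the printed text to a .csv file gives you an undirected edge table, where a higher weight means two characters share more scenes.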
Gephi is rather complex and can be overwhelming for new users. Thankfully, there are several nice tutorials to introduce new users to Gephi's functionality. In late March, Brian Sarnacki posted The Complete n00b's Guide to Gephi on his blog. By walking the reader through six key steps, Sarnacki's tutorial introduces readers to Gephi's workflow. Another great tutorial is Miriam Posner's A fun way to introduce DH students to dataviz, which is a classroom exercise aimed at teaching students how to manage datasets and visualize them. In addition to covering Gephi, Posner's tutorial also touches on OpenRefine. Both tutorials are excellent introductions and should help you learn how to use this powerful open source tool.
Quantitative archaeology with open source software
Like many academic fields, archaeology makes use of statistical software packages for quantitative research. The most popular packages are, of course, closed source solutions, like SPSS and Excel. The Arc-Team Open Research blog makes the case for using R in place of those closed source options. The problems with the closed source options, as identified by the blog post, are proprietary file formats, the difficulty of reproducing choices/options in a point-and-click environment, and the fact that the algorithms used in the statistical analysis are not open. The alternative is to use R, and the post highlights how RStudio, the "Hadleyverse" (a collection of R add-on packages), and the R community make R an excellent choice for archaeologists.
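The reproducibility point is easier to see with a small script. The sketch below is not taken from the blog post, which recommends R; it uses Python's standard library only to illustrate the same idea, and the measurements and names in it are invented. Every choice that would otherwise be a click in a dialog box is written down where someone else can read it and rerun it.

```python
import statistics

# Hypothetical measurements (in millimeters) standing in for real field data.
LENGTHS_MM = [41.2, 38.7, 44.0, 39.5, 42.8, 40.1]


def summarize(values):
    """Return descriptive statistics, with each analytic choice made explicit."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample (n - 1) standard deviation; the choice is recorded here
        # instead of being hidden behind a menu option.
        "stdev": statistics.stdev(values),
    }


for name, value in summarize(LENGTHS_MM).items():
    print(name, value)
```

Because the analysis is plain text, it can be shared, kept under version control, and rerun on new data without anyone having to remember which options were selected.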
Hydra-in-a-Box joint initiative
The Digital Public Library of America announced that Hydra-in-a-Box has received a $2 million grant from the Institute of Museum and Library Services. Hydra-in-a-Box is a joint project from the DPLA, Stanford University Libraries, and DuraSpace whose goal is to "produce a turnkey, Hydra-based solution that can be widely and easily adopted by institutions nationwide," with each organization providing a different area of expertise. Hydra is already a powerful solution for managing and sharing digital assets, and Hydra-in-a-Box will make the project even more approachable for end users wanting to use Hydra to share digital objects.
The cost of customization
The Berkeley Digital Humanities blog takes a fascinating look at using custom code vs. prepackaged solutions for digital humanities projects. The funding for digital humanities projects does not always cover the long-term maintenance of the projects, and quite often the primary researcher/developer no longer takes an active role in maintaining a project after it is finished. The blog post takes a look at DiRT Directory as an example of a project that has changed hands several times, which means new people have had to take over supporting it. DiRT Directory used Drupal for its redesign in 2011, so the project's code has a broad support structure through the Drupal community, which it would not have if it had used custom code (even custom code developed with an open source framework). Long-term maintenance is something that should concern everyone, and this blog post is worthwhile reading even for those outside the digital humanities.
This is a monthly column on the state of open digital humanities. If you have news pertaining to this topic that you would like to share, please send an email to Joshua Allen Holm. If you would like to contribute an article on this topic, please send your submission to the Opensource.com editorial team.