Studying polar data with the help of Apache Tika

Photo by Derek Thomas, CC BY-SA 2.0

In mid-April, members of the open source community will gather in Austin for ApacheCon North America where Annie Bryant Burgess, a postdoctoral fellow in the computer science department at the University of Southern California and project assistant at NASA's Jet Propulsion Laboratory (NASA/JPL), will give a particularly interesting talk.

Annie has a PhD in geography with a focus on satellite remote sensing of snow and ice, and in the past year she has become an Apache Tika PMC committer and advocate for the involvement of women in ASF. In this interview, she offers a preview of her talk, explains how her PhD is related to her involvement in open source, and tells us what Apache Tika has to do with studying polar data.

How is your PhD in geography connected to your involvement in open source?

For the past 10 years, I have straddled the divide between Earth science and informatics. My PhD focused on remote sensing and snow hydrology, but I entered the world of data science and software development when faced with challenges in processing and distributing the immense amounts of data produced by my research. Fortunately, I had the opportunity to collaborate with a group of computer scientists at NASA/JPL who helped guide me into the world of open source software and the Apache way.

I am currently a postdoc at the University of Southern California, working with Dr. Chris Mattmann as part of a team tasked with building open source cyberinfrastructure tools that help polar scientists find relevant data. My main task is to develop code (as a PMC committer for Apache Tika) for the unique needs of the polar science community, specifically, expanding text and metadata extraction for scientific data formats.

Your bio says "evangelist for the involvement of women in ASF." What efforts are you and the Apache Software Foundation making to get more women involved?

For starters, I’ll talk to anyone who will listen about getting more women involved in tech! At ApacheCon this year, I’m organizing the first Women of ASF luncheon. While this is a small step, it will provide a forum for us to come together and address, as a community, how to get more women involved in ASF.

Your talk at ApacheCon is about cool insights into polar data. Without giving too much away, what do you plan to cover?

With so many data portals and repositories available, it is often difficult for polar scientists to find the data they need. I am going to show some of the new features we have added to Apache Tika to make polar data more searchable. I’ll also show some examples of how these new features can help polar scientists pose new types of search queries. Have no fear—I’ll also show some photos of cold, icy places and the “cool” data that we collect there.

You'll be discussing Apache Tika in your talk. What other open source technology do you use to gather polar data?

Apache Nutch and Apache Solr!

ApacheCon North America 2015 will be held April 13-16 in Austin, Texas. You can catch Annie's talk on Monday, April 13.

This article is part of the Speaker Interview Series for ApacheCon 2015. ApacheCon North America brings together the open source community to learn about the technologies and projects driving the future of open source and more. The conference takes place in Austin, TX from April 13-16, 2015.

Jan Iversen is Danish, lives in Spain, and has developed software since 1975. He is a member of the Apache Software Foundation (ASF), Chair of LABS and Apache OpenOffice, and committer/PMC member in several projects. Jan helped start Corinthia, a new generation of word processing. His main focus is to help make the ASF an even better place for open source projects.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.