Chris Mattmann is a frequent speaker at ApacheCon North America and has a wealth of experience in software design and the construction of large-scale data-intensive systems. His work has influenced a broad set of communities, ranging from helping NASA unlock data from its next generation of Earth science system satellites, to assisting graduate students at the University of Southern California (his alma mater) in the study of software architecture, all the way to helping industry and open source as a member of the Apache Software Foundation (ASF). When he's not busy being busy, he's spending time with his lovely wife and son, braving the mean streets of Southern California.
In this interview, Mattmann previews what he'll discuss at ApacheCon in Austin.
You're a long-term member of the Apache Software Foundation. What motivated you to get involved in open source and the ASF?
I’ve been involved in the ASF since 2005, when I joined the Apache Nutch project. I was a PhD student at USC taking a Search Engines class and also working at NASA JPL. My final project in the class was an RSS parsing plugin (NUTCH-30) that got integrated. It was a budding, awesome community, and I got more and more excited after my patch and started helping out on the lists. I also saw a big use for Nutch and what eventually became Hadoop at NASA.
One of your ApacheCon Austin talks is called "If you have the content, then Apache has the technology!" That's a bold statement. Without giving too much away, what do you plan to cover in your talk?
I plan on giving an overview of the available Apache Stack of Content technologies: Tika, UIMA, Lucene, Nutch, Solr, ODFToolkit, cTAKES, and more.
Looking at it from 10 miles high, what will lead the content management scene in the future: closed source or open source?
My goal, as I outlined in my 2013 Nature paper, is to create and promote "digital babel fish" technology. I think we need to open all the formats, and beyond that we need to create "mediator" technologies that will extract text, metadata, language information and provide means to deal with it automatically and reliably.
Why does this talk focus on "Apache has the technology" when Apache is "community over code?"
Go ask Nick Burch, who has given this talk at many ApacheCons. I think it was more for fun than anything else, but also to highlight the different content technology "communities" that are at Apache.
"If you have the content, then Apache has the technology!" is one of the traditional talks at ApacheCon. Can listeners expect some exciting news?
If I can get the time, I’d love to try and do a demo. In one of my prior ones, I was running rm -rf commands live. I only hope to have the same level of success.
This article is part of the Speaker Interview Series for ApacheCon 2015. ApacheCon North America brings together the open source community to learn about the technologies and projects driving the future of open source and more. The conference takes place in Austin, TX from April 13-16, 2015.