What's next for open source question answering technologies

Grant Ingersoll is CTO at Lucidworks, provider of Fusion, but his claim to fame in the open source community is his contributions to Apache Lucene, Solr, and Mahout. (He co-founded Apache Mahout in 2008 with the goal of building an environment for quickly creating scalable machine learning applications.) This year, Grant will be speaking at OSCON 2015 about building a next-generation QA system with open source tools and about how to use Apache Solr for data science.

If you're interested in how Watson beat its human opponents on Jeopardy, read more in this interview about question answering (QA) technologies.

What are a couple examples of particularly innovative uses of question answering technologies right now? Are there projects that make you think 'I wish I'd thought of that first'?

We probably take it for granted already and it's only been a few short years, but when the likes of Siri and Google Now came out with their ability to give you answers right over your phone, I was pretty blown away because it combines and makes elegant (for the most part) so many things that need to happen in order to have a good user experience in a question answering system.

I think right now a lot of the effort is going into how we make these systems more accurate and more capable of answering a wider variety of questions.

OSCON attracts a diverse mix of people working with open source technologies. What kind of attendees will be most excited by what they'll take away from your talk?

Both of my talks ("You've got Questions" and "Solr for Data Science") are geared towards technical people without prior experience in QA or Apache Solr. Developers will likely get the most out of the session, but both talks aim to show and discuss what concepts are behind the technology and where they fit in the stack.

You're going to talk about building a next-generation QA system. What is "next" for question answering technologies?

Bigger, better, faster, for the most part. We're still very early on in the types of questions these systems can answer, but given the progress of late in machine learning and artificial intelligence, the complexity of tasks these systems can handle is growing quite rapidly. As I mentioned above, I'd say most of the effort is going into making them higher quality and able to handle a wider variety of questions. Right now, most of these systems are focused on less complex answers (at least relative to how we humans answer questions), but someday perhaps they will be able to answer much more complex questions like the "compare and contrast" type that we all loved so much in high school.

Without giving away your whole talk, tell us more about the system that's able to answer real natural language questions that you plan to demonstrate.

The talk is primarily driven by examples from my book, Taming Text. The system is built using Solr, OpenNLP, and a few hundred lines of code we wrote for the book (available on GitHub). It focuses on answering fact-based questions like "who is the President of the United States." It is designed to showcase the concepts without sweating the details of performance, etc.
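
To make the idea of a fact-based QA system a little more concrete, here is a minimal sketch in Python. It is not the code from Taming Text and does not use Solr or OpenNLP. It only illustrates, under simplifying assumptions, two steps such a system might include: guessing what kind of answer a question expects, and ranking candidate passages by keyword overlap. The names ANSWER_TYPES, STOPWORDS, answer_type, and rank_passages are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical mapping from a question's first word to an expected answer type.
# A real system would more likely use a trained classifier than this lookup.
ANSWER_TYPES = {
    "who": "PERSON",
    "where": "LOCATION",
    "when": "DATE",
}

# Small, hand-picked stopword list (an assumption made for this sketch).
STOPWORDS = {"who", "where", "when", "what", "is", "was", "the", "of", "a", "an", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def answer_type(question):
    """Guess the expected answer type from the first word of the question."""
    tokens = tokenize(question)
    if not tokens:
        return "UNKNOWN"
    return ANSWER_TYPES.get(tokens[0], "UNKNOWN")


def rank_passages(question, passages):
    """Order passages by how many distinct question keywords they contain.

    Passages that share no keywords with the question are dropped.
    """
    keywords = set(tokenize(question)) - STOPWORDS
    scored = [(len(keywords & set(tokenize(p))), p) for p in passages]
    # sorted() is stable, so passages with equal scores keep their input order.
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [passage for score, passage in ranked if score > 0]
```

In a fuller system, a search engine would supply the candidate passages and a named-entity recognizer would pick out spans matching the expected answer type. Here the passages are simply passed in as a list of strings.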

Speaker Interview

This article is part of the Speaker Interview Series for OSCON 2015. OSCON is everything open source: the full stack, with all of the languages, tools, frameworks, and best practices that you use in your work every day. OSCON 2015 will be held July 20-24 in Portland, Oregon.

Rikki Endsley is the Developer Program managing editor at Red Hat, and a former community architect and editor for Opensource.com.


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.