As soon as you start to think seriously about building your own question answering (QA) system based on the QALL-ME Framework, you will also have to think about implementing your own versions of the framework’s web service (WS) components. Sometimes it may be possible to derive your components from the demo component implementations that we supply in the DemoKit (cf. 4.1.1: “Stable Releases”). However, especially when moving to a new language, you will probably have to (re-)implement the components yourself. In this section we show by example how such an implementation can be done from the bottom up.
The example WS component that we would like to implement is a new TermAnnotator for English; for detailed information on the TermAnnotator component see section 4.2.3.1. We will follow a rather simple, dictionary-based approach: only terms that are contained in a given list of terms (the dictionary) will be annotated; for simplicity we will furthermore annotate only one type of term.[8] The actual finding of terms in input questions will be handled by the free OpenNLP library.
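To make the dictionary-based idea concrete before any library is involved, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: it uses neither OpenNLP nor the QALL-ME TermAnnotator interface, it tokenizes naively on whitespace, and all names in it (DictionaryTermSketch, Match, buildDictionary, annotate) are hypothetical and were invented for this example. It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/**
 * Hypothetical illustration of dictionary-based term finding.
 * Not part of the QALL-ME Framework or of OpenNLP.
 */
final class DictionaryTermSketch {

    /** A matched term: token offsets [start, end) and the matched surface text. */
    static final class Match {
        final int start;
        final int end;
        final String text;

        Match(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }

        @Override
        public String toString() {
            return text + " [" + start + ", " + end + ")";
        }
    }

    /** Normalizes a term so that dictionary lookups are case-insensitive. */
    static String normalize(String term) {
        return term.trim().toLowerCase(Locale.ROOT);
    }

    /** Builds the dictionary as a set of normalized terms. */
    static Set<String> buildDictionary(String... terms) {
        Set<String> dictionary = new HashSet<>();
        for (String term : terms) {
            dictionary.add(normalize(term));
        }
        return dictionary;
    }

    /**
     * Scans the question from left to right and, at each token position,
     * tries the longest candidate first (up to maxTermLength tokens),
     * so that "Grand Hotel" is preferred over "Hotel".
     */
    static List<Match> annotate(String question, Set<String> dictionary, int maxTermLength) {
        // Assumption: whitespace tokenization is good enough for the sketch.
        String[] tokens = question.trim().split("\\s+");
        List<Match> matches = new ArrayList<>();
        int i = 0;
        while (i < tokens.length) {
            int matchedEnd = -1;
            for (int end = Math.min(tokens.length, i + maxTermLength); end > i; end--) {
                String candidate = String.join(" ", Arrays.copyOfRange(tokens, i, end));
                if (dictionary.contains(normalize(candidate))) {
                    matches.add(new Match(i, end, candidate));
                    matchedEnd = end;
                    break;
                }
            }
            // Continue after the match, or advance by one token if nothing matched.
            i = (matchedEnd != -1) ? matchedEnd : i + 1;
        }
        return matches;
    }
}
```

Calling annotate with a dictionary built from "Grand Hotel" and "Hotel" on a question that contains "Grand Hotel" yields a single match covering both tokens, because the longer candidate is tried first. A real component would replace the whitespace tokenizer and the lookup loop with the library's term finding and would return the result through the TermAnnotator web service interface.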
In order to follow this tutorial closely, you should now get yourself a copy of the OpenNLP library. At the time of writing, version 1.4.3 was the latest release of the library, and it is the version we use in this tutorial. After downloading and unpacking the archive, you first have to build the library; to do so, simply run Apache Ant on the build file provided in the root directory of your copy of the library. A new directory named output will be created, containing both the compiled classes and a JAR file with the full library. Add either the compiled classes or the JAR file to your Java build path.
For this tutorial you also need the JAX-WS API on your Java build path. Section 3.1.4: “Java Web Service API And Tools” provides a brief summary of where to get this API.
Lastly, we will also make use of existing code from the DemoKit implementations, or strictly speaking only from the Util class in the de.dfki.qallme package. This class should therefore be on your build path for this tutorial, too.
[8] Adding the possibility to annotate an arbitrary number of term types should be pretty easy, though, and is left as an exercise for the reader.