This section describes the components that appear in the conceptual QA workflow of the QALL-ME Framework, as depicted in Figure 2.1. If you are more interested in the actual implementation of these components, you can go directly to Chapter 4: “Software Reference”. Note, however, that the concepts behind the components are explained only here; the software reference is mainly intended as technical documentation of interfaces and the like.
As soon as a user asks a question, no matter through which UI, the QA planner receives the user’s inquiry, which consists of the question and the question’s context (cf. the Receive Question and Context component). The inquiry is put into a question object, QObj, on which the planner works in the following steps. The first addition to the QObj is the language of the inquiry, which is recognized automatically by a Language Identification component.
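The first two steps can be sketched as follows. This is a minimal illustration and not the framework's implementation: the `QObj` fields, the function names and the stopword-overlap heuristic are all assumptions made for this example, and a real language identifier would use a trained model.

```python
from dataclasses import dataclass, field
from typing import Optional

# Tiny illustrative stopword lists (assumption, not part of the framework).
STOPWORDS = {
    "en": {"the", "is", "where", "what", "can", "i", "in", "a"},
    "de": {"der", "die", "das", "ist", "wo", "was", "kann", "ich", "in", "ein"},
}


@dataclass
class QObj:
    """Hypothetical question object that the planner enriches step by step."""
    question: str
    context: dict                      # e.g. location and time of the inquiry
    language: Optional[str] = None     # filled in by language identification
    annotations: list = field(default_factory=list)


def identify_language(text: str) -> Optional[str]:
    """Return the language whose stopwords overlap most with the text."""
    tokens = set(text.lower().replace("?", " ").split())
    scores = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def receive_question_and_context(question: str, context: dict) -> QObj:
    """Wrap the inquiry in a QObj and record its detected language."""
    qobj = QObj(question=question, context=context)
    qobj.language = identify_language(question)
    return qobj
```

All later steps of the workflow would then read from and add to the same object.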
Depending on the location of the inquiry, the QA planner now selects an appropriate Entity Annotation component and lets it annotate those entity names in the question that depend on the available answer data. Such entity names might be cinema names or the names of movies showing in local cinemas. The types of entities that have to be annotated here therefore depend on the domain of the QA system being built.
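A location-dependent choice of entity annotator could look like the sketch below. It assumes a simple gazetteer lookup per location; the location keys, the cinema and movie names, and the canonical identifiers are made up for illustration.

```python
import re

# Hypothetical per-location gazetteers:
# surface name -> (entity type, canonical identifier).
GAZETTEERS = {
    "Trento": {
        "Cinema Astra": ("Cinema", "cinema:astra"),
        "Casablanca": ("Movie", "movie:casablanca"),
    },
    "Saarbrücken": {
        "Camera Zwo": ("Cinema", "cinema:camera-zwo"),
    },
}


def select_entity_annotator(location):
    """Return an annotator bound to the gazetteer of the given location."""
    gazetteer = GAZETTEERS.get(location, {})

    def annotate(question):
        found = []
        for name, (etype, canonical) in gazetteer.items():
            # Case-insensitive literal match of the entity name.
            for m in re.finditer(re.escape(name), question, flags=re.IGNORECASE):
                found.append({
                    "start": m.start(),
                    "end": m.end(),
                    "type": etype,
                    "canonical": canonical,
                })
        return sorted(found, key=lambda a: a["start"])

    return annotate
```

Because the gazetteer is derived from the answer data of one location, the same question can yield different annotations for different locations.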
A second annotation component follows in the QA planner’s workflow: a Term Annotation component is selected according to the question language and annotates language-specific terms that are relevant to the QA system’s domain. It differs from the previously employed Entity Annotation component in that the latter usually annotates only named entities, which are more or less language independent. The term annotator, in contrast, annotates terms whose form depends on the language in which a concept is expressed. Consider hotel facilities, for example: concepts like TV, swimming pool or safe are expressed differently in different languages, e.g., the English “hair dryer” is equivalent to the German “Fön”. Location-specific named entities like “Berlin” or “New York”, however, mostly stay the same across languages; that is why there is a separate Entity Annotation component.
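The hair dryer example can be made concrete with a small sketch. It assumes one lexicon per language that maps surface terms to a shared concept identifier; the lexicon entries and the `facility:` identifiers are hypothetical.

```python
import re

# Hypothetical lexicons: language -> surface term -> concept identifier.
TERM_LEXICONS = {
    "en": {
        "hair dryer": "facility:HairDryer",
        "swimming pool": "facility:SwimmingPool",
    },
    "de": {
        "fön": "facility:HairDryer",
        "schwimmbad": "facility:SwimmingPool",
    },
}


def annotate_terms(question, language):
    """Annotate language-specific terms with language-independent concepts."""
    lexicon = TERM_LEXICONS.get(language, {})
    annotations = []
    for term, concept in lexicon.items():
        # Whole-word, case-insensitive match of the surface term.
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, question, flags=re.IGNORECASE):
            annotations.append({
                "start": m.start(),
                "end": m.end(),
                "concept": concept,
            })
    return sorted(annotations, key=lambda a: a["start"])
```

An English and a German question about the same facility then carry the same concept identifier, even though the surface terms differ.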
To complete the annotation of the input question, the QA planner selects an appropriate Temporal Expression Annotation component, which annotates temporal expressions in the input. Such temporal expressions resemble the previously mentioned terms in that they are highly language specific. For example, the English “yesterday” and the German “gestern” refer to the same time when used in the same context.
One last thing should be noted about the three annotation components: they not only annotate but also determine a canonical, language-independent (normalized) form of each annotated entity, term or expression. A language-dependent temporal expression, for example, is normalized to a common format like TIMEX2 or ISO 8601. These canonical forms make it much easier for the subsequent Query Generation component to translate the annotated expression into an answer search query.
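As an illustration of normalization, the sketch below maps a few relative day expressions to ISO 8601 calendar dates. The lexicon and the function name are assumptions; the reference date stands for the time at which the question was asked and would come from the question's context.

```python
from datetime import date, timedelta

# Hypothetical lexicon: language -> relative expression -> offset in days.
RELATIVE_DAYS = {
    "en": {"yesterday": -1, "today": 0, "tomorrow": 1},
    "de": {"gestern": -1, "heute": 0, "morgen": 1},
}


def normalize_temporal(expression, language, reference):
    """Return the ISO 8601 date for a relative day expression, or None."""
    offset = RELATIVE_DAYS.get(language, {}).get(expression.lower())
    if offset is None:
        return None
    return (reference + timedelta(days=offset)).isoformat()


reference = date(2010, 3, 1)
print(normalize_temporal("yesterday", "en", reference))  # 2010-02-28
print(normalize_temporal("gestern", "de", reference))    # 2010-02-28
```

Both expressions end up as the same string, so a query generator no longer needs to know which language the question was asked in.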
After the annotation steps, the QA planner is ready to generate a query for finding an answer to the input question. It chooses a suitable Query Generation component for the question language and, using a component for the Recognition of Textual Entailment (RTE) as described in section 2.1.5: “Recognizing Textual Entailment for QA”, generates a SPARQL query that is applicable to the available answer data. As detailed in that section, this is one of the most important steps of the workflow in the whole QALL-ME Framework because it builds the actual bridge between the information in the input question and the potential answer data.
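One way to picture this step is a store of question patterns, each paired with a SPARQL template: the question is compared with every pattern, and the template of the first entailed pattern is filled with the canonical entity values. The sketch below rests on that assumption and is not taken from the framework. In particular, the `entails` function is only a token-coverage stand-in for a real RTE component, and the `ex:` prefix, property names and patterns are invented.

```python
from string import Template

PREFIX = "PREFIX ex: <http://example.org/onto#>\n"

# Hypothetical pattern store: question pattern -> SPARQL template.
PATTERNS = [
    ("where can i see the movie [MOVIE]",
     Template(PREFIX + 'SELECT ?cinema WHERE '
              '{ ?cinema ex:shows ?m . ?m ex:name "$MOVIE" . }')),
    ("what is the address of the cinema [CINEMA]",
     Template(PREFIX + 'SELECT ?addr WHERE '
              '{ ?c ex:name "$CINEMA" ; ex:address ?addr . }')),
]


def entails(question_tokens, pattern_tokens, threshold=0.8):
    """Stand-in for RTE: share of pattern tokens covered by the question."""
    if not pattern_tokens:
        return False
    covered = sum(1 for t in pattern_tokens if t in question_tokens)
    return covered / len(pattern_tokens) >= threshold


def generate_query(question, entities):
    """entities maps an entity type such as 'MOVIE' to its canonical value."""
    text = question
    for etype, value in entities.items():
        text = text.replace(value, f"[{etype}]")   # abstract over entities
    q_tokens = set(text.lower().rstrip("?").split())
    for pattern, template in PATTERNS:
        if entails(q_tokens, pattern.lower().split()):
            try:
                return template.substitute(entities)
            except KeyError:
                continue        # a required entity was not annotated
    return None                 # no pattern entailed
```

Returning `None` when no pattern is entailed gives the caller a clear signal that no query could be generated.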
If no query can be generated, the user has probably posed an out-of-domain question, and an empty answer is returned (cf. the Create Empty AObj component).
Otherwise, the SPARQL query is fed into an Answer Retrieval component. A suitable implementation of this component is again selected by the QA planner according to the spatial context of the question. The answer retrieval component is connected to a location-dependent RDF fact database, from which it retrieves the answers that the QA planner eventually returns.
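The last two branches of the workflow can be summarized in a short sketch. The `AObj` class, the `FakeFactStore` stand-in and the `answer` function are hypothetical; a real answer retrieval component would send the SPARQL query to an RDF database instead of looking it up in a dictionary. Returning an empty answer for an unknown location is a further assumption of this example.

```python
from dataclasses import dataclass, field


@dataclass
class AObj:
    """Hypothetical answer object returned by the planner."""
    answers: list = field(default_factory=list)


class FakeFactStore:
    """Stand-in for a location-dependent RDF fact database."""

    def __init__(self, canned):
        self.canned = canned            # query string -> list of answers

    def retrieve(self, sparql):
        return list(self.canned.get(sparql, []))


def answer(query, location, stores):
    """Return an empty AObj if there is no query, else retrieve answers."""
    if query is None:                   # probably an out-of-domain question
        return AObj()
    store = stores.get(location)        # selected by spatial context
    if store is None:                   # assumption: unknown location
        return AObj()
    return AObj(answers=store.retrieve(query))
```

The caller always receives an `AObj`, so a user interface can treat "no answer" and "some answers" uniformly.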