2.3.2. System Components

This section describes the components that appear in the conceptual QA workflow of the QALL-ME Framework, as depicted in Figure 2.1. If you are more interested in the actual implementation of these components, you may go directly to Chapter 4: “Software Reference”. Note, however, that the concepts behind the components are explained only here; the software reference is mainly intended as technical documentation of interfaces and the like.

As soon as a user asks a question, no matter through which UI, the QA planner receives the user’s inquiry, which consists of the question and the question’s context (cf. the Receive Question and Context component). The inquiry is put into a question object QObj, on which the planner works in the following steps. The first addition to the QObj is the language of the inquiry, which is recognized automatically by a Language Identification component.
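The first two steps can be pictured with a minimal sketch. It is not the framework's implementation: the field names of `QObj`, the `location` context key, and the stopword-based language guess are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class QObj:
    """Hypothetical question object; the field names are illustrative."""
    question: str
    context: dict                      # e.g. {"location": ...}, assumed layout
    language: Optional[str] = None     # filled in by language identification
    annotations: list = field(default_factory=list)

# Toy stopword lists standing in for a real language identifier.
STOPWORDS = {
    "en": {"the", "is", "where", "what", "when", "in"},
    "de": {"der", "die", "das", "ist", "wo", "was", "wann", "im"},
}

def identify_language(text: str) -> Optional[str]:
    """Return the language whose stopwords overlap most with the text."""
    tokens = set(text.lower().replace("?", " ").split())
    scores = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def receive_question(question: str, context: dict) -> QObj:
    """Wrap the inquiry in a QObj and record its detected language."""
    qobj = QObj(question=question, context=context)
    qobj.language = identify_language(question)
    return qobj
```

Calling `receive_question("Wo ist das Kino?", {"location": "Berlin"})` would yield a `QObj` whose `language` is `"de"` under these toy stopword lists.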

Depending on the location of the inquiry, the QA planner then selects an appropriate Entity Annotation component, which annotates entity names in the question; which names it recognizes depends on the available answer data. Such entity names might be names of cinemas or of movies that are shown in local cinemas. The types of entities to be annotated therefore depend on the domain of the QA system being built.

A second annotation component follows in the QA planner’s workflow: a Term Annotation component is selected according to the question language and annotates language-specific terms that are relevant for the QA system’s domain. It differs from the Entity Annotation component in that the entity annotator usually annotates only named entities, which are more or less language independent, whereas the term annotator annotates terms whose wording depends on the language. Consider hotel facilities, for example: concepts like TV, swimming pool, or safe are expressed differently in different languages, e.g., the English “hair dryer” is equivalent to the German “Fön”. Location-specific (named) entities like “Berlin” or “New York”, in contrast, mostly stay the same even when they are used in different languages; this is why they are handled by the separate Entity Annotation component.
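The contrast between the two annotators can be illustrated with a small sketch. It assumes simple lookup tables: an entity gazetteer chosen by location and a term lexicon chosen by language. The table contents, the canonical identifiers, and the `annotate` helper are hypothetical and not taken from the framework.

```python
# Entity gazetteers are selected by the location of the inquiry (toy data).
ENTITY_GAZETTEERS = {
    "Berlin": {"Example Cinema": "cinema:example"},
}

# Term lexicons are selected by the question language; different surface
# forms map to the same canonical concept (concept names are assumed).
TERM_LEXICONS = {
    "en": {"hair dryer": "HairDryer", "swimming pool": "SwimmingPool"},
    "de": {"fön": "HairDryer", "schwimmbad": "SwimmingPool"},
}

def annotate(text, lexicon):
    """Return (start, end, canonical) for every lexicon entry found in text."""
    lowered = text.lower()
    found = []
    for surface, canonical in lexicon.items():
        start = lowered.find(surface.lower())
        if start != -1:
            found.append((start, start + len(surface), canonical))
    return sorted(found)

# The English and German questions end up with the same canonical concept.
english = annotate("Which hotels have a hair dryer?", TERM_LEXICONS["en"])
german = annotate("Welche Hotels haben einen Fön?", TERM_LEXICONS["de"])
entities = annotate("What is on at Example Cinema?", ENTITY_GAZETTEERS["Berlin"])
```

Both `english` and `german` carry the canonical value `HairDryer`, which is the point of keeping term lexicons per language while entity gazetteers are keyed by location.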

To complete the annotation of the input question, the QA planner selects an appropriate Temporal Expression Annotation component, which annotates temporal expressions in the input. Such temporal expressions are similar to the previously mentioned terms in that they are highly language specific. For example, the English “yesterday” and the German “gestern” are different words but refer to the same time when used in the same context.

One last point should be noted regarding the three annotation components: they not only annotate but also determine a canonical, that is, language-independent or normalized, form of each annotated entity, term or expression. A language-dependent temporal expression, for example, is normalized to a common format like TIMEX2 or ISO 8601. With these canonical forms it is much easier for the subsequent Query Generation component to translate the annotated expression into an answer search query.
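As a rough illustration of such a normalization, the following sketch maps a few relative day expressions to ISO 8601 dates. The word lists, the function name `normalize_temporal`, and the idea of passing a reference date taken from the question context are assumptions; a real temporal annotator covers far more expressions.

```python
from datetime import date, timedelta

# Language-specific surface forms mapped to a day offset (toy coverage).
RELATIVE_DAYS = {
    "en": {"yesterday": -1, "today": 0, "tomorrow": 1},
    "de": {"gestern": -1, "heute": 0, "morgen": 1},
}

def normalize_temporal(text, language, reference):
    """Return (surface form, ISO 8601 date) pairs relative to a reference date."""
    results = []
    offsets = RELATIVE_DAYS.get(language, {})
    for token in text.lower().replace("?", " ").split():
        offset = offsets.get(token)
        if offset is not None:
            normalized = (reference + timedelta(days=offset)).isoformat()
            results.append((token, normalized))
    return results

# The same reference date gives the same canonical value in both languages.
reference = date(2009, 6, 15)
english = normalize_temporal("What was shown yesterday?", "en", reference)
german = normalize_temporal("Was lief gestern?", "de", reference)
```

Here both questions normalize to the date `2009-06-14`, so a later query generation step no longer needs to know which language the expression came from.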

After the annotation steps, the QA planner is ready to generate a query for finding an answer to the input question. It chooses a suitable Query Generation component for the question language. Using a component for the Recognition of Textual Entailment (RTE), as described in Section 2.1.5: “Recognizing Textual Entailment for QA”, this component generates a SPARQL query that is applicable to the available answer data. As detailed in that section, this is one of the most important steps of the workflow in the whole QALL-ME Framework because it builds the actual bridge between the information in the input question and the potential answer data.
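A strongly simplified sketch of this step is given below. It assumes that each known question pattern is paired with a SPARQL template and that annotated entities have already been replaced by slot markers such as `[MOVIE]`. The entailment check is replaced by plain token overlap, which is only a stand-in and not real textual entailment; the pattern, the `ex:` vocabulary, and all function names are hypothetical.

```python
from string import Template

# Hypothetical pattern/template pairs; the ex: vocabulary is made up.
PATTERNS = [
    ("where can i see the movie [MOVIE]",
     Template('SELECT ?cinema WHERE { ?cinema ex:shows ?m . '
              '?m ex:name "$MOVIE" . }')),
]

def tokens(text):
    """Lowercase the text and split it into a set of tokens."""
    return set(text.lower().replace("?", " ").split())

def overlap(question, pattern):
    """Jaccard overlap, used here as a crude stand-in for an RTE decision."""
    a, b = tokens(question), tokens(pattern)
    return len(a & b) / len(a | b) if a | b else 0.0

def generate_query(question, slots, threshold=0.5):
    """Return a SPARQL string for the best matching pattern, or None."""
    best_score, best_template = 0.0, None
    for pattern, template in PATTERNS:
        score = overlap(question, pattern)
        if score > best_score:
            best_score, best_template = score, template
    if best_template is None or best_score < threshold:
        return None  # no pattern is close enough: treat as out of domain
    return best_template.substitute(slots)

query = generate_query("Where can I see the movie [MOVIE]?",
                       {"MOVIE": "Casablanca"})
no_query = generate_query("What is the capital of France?", {})
```

With these assumptions, `query` is a SPARQL string containing the canonical movie name, while `no_query` is `None` because the question does not resemble any known pattern.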

If no query can be generated, the user has probably posed an out-of-domain question, and an empty answer is returned (cf. the Create Empty AObj component). Otherwise the SPARQL query is passed to an Answer Retrieval component. A suitable implementation of this component is again selected by the QA planner according to the spatial context of the question. The Answer Retrieval component is connected to a location-dependent RDF fact database, from which it retrieves the answers that the QA planner eventually returns.
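The final branching of the workflow can be sketched as follows. The `AObj` fields, the `finish_workflow` function, and the dictionary of retrievers keyed by location are assumptions for illustration; the retriever is a stub and does not access an RDF database.

```python
from dataclasses import dataclass, field

@dataclass
class AObj:
    """Hypothetical answer object; an empty list means 'no answer'."""
    answers: list = field(default_factory=list)

def finish_workflow(query, location, retrievers):
    """Return an empty AObj without a query, else ask the local retriever."""
    if query is None:
        return AObj()                  # corresponds to Create Empty AObj
    retrieve = retrievers[location]    # selected by the spatial context
    return AObj(answers=retrieve(query))

# Stub retriever standing in for a location-dependent RDF fact database.
def berlin_retriever(query):
    return ["Example Cinema"] if "ex:shows" in query else []

retrievers = {"Berlin": berlin_retriever}

answered = finish_workflow("SELECT ?cinema WHERE { ?cinema ex:shows ?m . }",
                           "Berlin", retrievers)
out_of_domain = finish_workflow(None, "Berlin", retrievers)
```

Keeping the retrievers in a mapping keyed by location mirrors the selection by spatial context described above: adding a new location means registering one more retriever, without touching the workflow function.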