Chapter 2. Framework Description

Table of Contents

2.1. Overview
2.1.1. Multilingual QA
2.1.2. Structured Answer Data Sources
2.1.3. Use of a Domain Ontology
2.1.4. Spatiotemporal Anchoring of Questions
2.1.5. Recognizing Textual Entailment for QA
2.1.6. Service Oriented Architecture
2.2. Prerequisites
2.2.1. Target Audience
2.2.2. Software Requirements
2.3. System Architecture
2.3.1. Question Answering Workflow
2.3.2. System Components
2.4. By the Way: What is QALL-ME?

The QALL-ME Framework is an architecture skeleton for multilingual question answering (QA) systems that answer questions with the help of structured answer data sources from freely specifiable domains. The language barrier is crossed with the help of a domain ontology. Provisions are made to easily anchor questions in space and time. Finding the mapping between question and answer – which is the major step in all QA systems – is done using a novel approach which is based on textual entailment recognizers. The framework is based on a Service Oriented Architecture (SOA) which is realized using web services.

The QALL-ME Framework is free software and comes with a set of demo components which illustrate the potential of the approach and which help new developers to get started with the framework. The framework seeks to be compliant with standards as far as possible in order to enhance interoperability and ease of use.

If parts of the framework's implementation do not suit your needs, you can always build a custom implementation that is still based on the ideas behind the QALL-ME Framework. We believe that these general ideas are a crucial aspect of QALL-ME and that they can be realized in many different ways; the QALL-ME Framework is just one of them. Specifically, it provides a basic realization in the form of the architectural skeleton described above, with the aim of getting you started on your own QA system relatively quickly.

The following sections provide a closer look at the ideas and main features of the QALL-ME Framework as listed above. If you are more interested in the technical details of how the framework works, you can go directly to section 2.3: “System Architecture”; please note, however, that that section assumes a basic knowledge of the ideas behind the QALL-ME Framework, which is provided explicitly only here.

Traditional QA approaches for structured answer data often performed a deep analysis of the question in order to obtain a logical form, which was then translated into a query suitable for the answer data. For various reasons[2] this approach has been found to be inadequate or even infeasible. In the QALL-ME Framework we address the mapping between natural language question and database query differently, using RTE (Recognizing Textual Entailment) components. With RTE components, the problems of finding logical representations of questions and then mapping these to database queries are bypassed through semantic inference at the textual level.

An RTE component can recognize whether some text T entails a hypothesis H, i.e., whether the meaning of H can be fully derived from the meaning of T. In the context of the QALL-ME Framework, H is always more or less a minimal form of a question about some topic, and T is more or less the question which has to be answered. Through RTE we can now find out which minimal questions are contained in a given question. The set of minimal questions has to be defined in advance, i.e., each (minimal) question that shall be answerable has to be in the set. Through RTE we can then handle all reformulations of these minimal questions. One crucial part is still missing on our way to the answer: the step from a minimal question to the appropriate answer. In principle that’s easy: we simply attach the correct answer to each minimal question and return that answer if the minimal question is entailed by the user question.

As you may have guessed already, in practice the process described above is a bit more complicated. Instead of using minimal natural language questions, we use patterns of minimal natural language questions. To get these patterns we replace certain entities in the minimal questions with placeholders. Furthermore, these patterns are not directly mapped to answers but rather to database query patterns containing the same placeholders as the question patterns. The question processing then works as follows: we first replace all entities in the user question with placeholders. Next, the RTE component tells us which of the (minimal) question patterns in our set is entailed by the question pattern we have just created from the input question. In the entailed pattern and the corresponding database query pattern we replace the respective placeholders with the entities that we previously removed from the original question. Now we have a complete database query which can easily be applied to the answer data.

As the query generation step using RTE is the central part of the QALL-ME Framework, we should take the time to go through an example of this step. Our imaginary sample QA system supports English input questions, i.e., before it can answer any questions, there has to be an RTE component for English. Furthermore, a set of mappings from minimal question patterns to query patterns has to be created for all English question types that shall be answerable. Let’s assume we have the following mappings:

1. Question Pattern: Who is the director of the movie [MOVIE]?
   Query Pattern: SELECT ?directorName WHERE { ?movie qmo:name "[MOVIE]" . ?movie qmo:hasDirector ?person . ?person qmo:name ?directorName . }

2. Question Pattern: Who wrote the screenplay for the movie [MOVIE]?
   Query Pattern: SELECT ?writerName WHERE { ?movie qmo:name "[MOVIE]" . ?movie qmo:hasWriter ?person . ?person qmo:name ?writerName . }

3. Question Pattern: Where can I see the movie [MOVIE]?
   Query Pattern: SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }

4. Question Pattern: In which cinema in [CITY] can I see the movie [MOVIE]?
   Query Pattern: SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:isInCity "[CITY]" . ?cinema qmo:name ?cinemaName . }

5. Question Pattern: Which movies can I see in [CINEMA]?
   Query Pattern: SELECT ?movieName WHERE { ?cinema qmo:name "[CINEMA]" . ?cinema qmo:showsMovie ?movie . ?movie qmo:name ?movieName . }


As the QALL-ME Framework assumes the answer data to be represented in RDF, it is only natural to use SPARQL as the database query language; see also the AnswerPool component in section 2.3.2: “System Components”. Note, however, that the SPARQL queries in the examples above are highly simplified and not well-formed (for instance, the qmo: prefix is never declared). Additionally, the examples use vocabulary from an imaginary RDF schema.

Observe the placeholders like [MOVIE] or [CITY] in each part of the mappings: for each placeholder in the question pattern of every mapping there is a corresponding placeholder in the SPARQL query pattern.
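The correspondence between placeholders can be checked mechanically. The following is a minimal sketch in Python, not part of the QALL-ME Framework: the function names and the regular expression are assumptions made for illustration, and it only assumes that placeholders are upper-case names in square brackets, as in the mappings above.

```python
import re

# Assumption: a placeholder is an upper-case name in square brackets, e.g. [MOVIE].
PLACEHOLDER = re.compile(r"\[([A-Z_]+)\]")


def placeholders(text: str) -> set[str]:
    """Return the set of placeholder names that occur in a pattern."""
    return set(PLACEHOLDER.findall(text))


def missing_in_query(question_pattern: str, query_pattern: str) -> list[str]:
    """List placeholders of the question pattern that the query pattern lacks."""
    return sorted(placeholders(question_pattern) - placeholders(query_pattern))


question = "In which cinema in [CITY] can I see the movie [MOVIE]?"
query = (
    'SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . '
    '?cinema qmo:showsMovie ?movie . ?cinema qmo:isInCity "[CITY]" . '
    "?cinema qmo:name ?cinemaName . }"
)

# An empty list means every question placeholder also appears in the query.
print(missing_in_query(question, query))
```

A check of this kind could be run over all mappings when they are authored, so that a mapping with an unmatched placeholder is caught before any question is asked.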

With the above mappings and the English RTE component we are now ready to answer the first question. Imagine the user has asked the following:

“Which cinemas show the movie Dreamgirls tonight?”

In the first step we replace all entities with placeholders and remember the replaced entities; this yields a pattern of the input question which is the input text T to the RTE component:

“Which cinemas show the movie [MOVIE] tonight?” ([MOVIE] = “Dreamgirls”)
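This replacement step can be sketched in Python as follows. The sketch assumes a tiny hand-written lookup table (`GAZETTEER`, a hypothetical name) in place of a real entity annotator, and it matches entities by plain substring search; an actual annotator component would be considerably more sophisticated.

```python
# Assumption: a toy lookup table from entity surface forms to entity types.
GAZETTEER = {
    "Dreamgirls": "MOVIE",
    "Pisa": "CITY",
}


def to_pattern(question: str, gazetteer: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Replace known entities with placeholders and remember what was replaced."""
    pattern = question
    bindings: dict[str, str] = {}
    for surface, entity_type in gazetteer.items():
        if surface in pattern:
            pattern = pattern.replace(surface, f"[{entity_type}]")
            bindings[entity_type] = surface
    return pattern, bindings


pattern, bindings = to_pattern(
    "Which cinemas show the movie Dreamgirls tonight?", GAZETTEER
)
print(pattern)   # Which cinemas show the movie [MOVIE] tonight?
print(bindings)  # {'MOVIE': 'Dreamgirls'}
```

Note that this sketch keeps only one entity per type; a question that mentions two movies would need a richer binding structure.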

The RTE component now has a text to work with. What’s still missing is a hypothesis H: this hypothesis is set to the question pattern of the first of our pattern mappings, then to the second, and so on until we find a question pattern (H) which is entailed by the pattern we have created from our input question (i.e., the text T). In our example we can stop with the third mapping of the above list: “Which cinemas show the movie [MOVIE] tonight?” (T) textually entails “Where can I see the movie [MOVIE]?” (H). With this we now know the SPARQL query pattern that will be used to find the answer to the original question:

SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }
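The search over hypotheses described above can be sketched in Python. Note that `toy_entails` is only a stand-in built from a hand-written table of known entailment pairs; it is an assumption for illustration and does not perform any actual textual entailment recognition. All function and variable names are hypothetical.

```python
from typing import Callable, Optional

# (question pattern, query pattern) pairs, taken from the first three mappings.
MAPPINGS = [
    ("Who is the director of the movie [MOVIE]?",
     'SELECT ?directorName WHERE { ?movie qmo:name "[MOVIE]" . '
     "?movie qmo:hasDirector ?person . ?person qmo:name ?directorName . }"),
    ("Who wrote the screenplay for the movie [MOVIE]?",
     'SELECT ?writerName WHERE { ?movie qmo:name "[MOVIE]" . '
     "?movie qmo:hasWriter ?person . ?person qmo:name ?writerName . }"),
    ("Where can I see the movie [MOVIE]?",
     'SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . '
     "?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }"),
]

# Stand-in for an RTE component: a fixed table of (T, H) pairs known to hold.
KNOWN_ENTAILMENTS = {
    ("Which cinemas show the movie [MOVIE] tonight?",
     "Where can I see the movie [MOVIE]?"),
}


def toy_entails(text: str, hypothesis: str) -> bool:
    """Pretend entailment check; a real RTE component would go here."""
    return text == hypothesis or (text, hypothesis) in KNOWN_ENTAILMENTS


def find_query_pattern(
    text: str,
    mappings: list[tuple[str, str]],
    entails: Callable[[str, str], bool],
) -> Optional[str]:
    """Try each question pattern as hypothesis H; return the first entailed one's query."""
    for question_pattern, query_pattern in mappings:
        if entails(text, question_pattern):
            return query_pattern
    return None  # no minimal question pattern is entailed


result = find_query_pattern(
    "Which cinemas show the movie [MOVIE] tonight?", MAPPINGS, toy_entails
)
print(result)
```

Passing the entailment check in as a function keeps the search loop independent of any particular RTE implementation, which mirrors the idea that a different recognizer can be plugged in per language.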

In the last step we only need to replace all placeholders in the query pattern with the entities that we have remembered for our input question. This yields a complete SPARQL query that can be used to find the answer to the original question:

SELECT ?cinemaName WHERE { ?movie qmo:name "Dreamgirls" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }
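The placeholder substitution that produces this query can be sketched in Python as follows. The function name `instantiate` and the placeholder syntax check are assumptions for illustration; the sketch also does no escaping of the inserted values, which a real system would need to handle.

```python
import re

# Assumption: a placeholder is an upper-case name in square brackets, e.g. [MOVIE].
PLACEHOLDER = re.compile(r"\[([A-Z_]+)\]")


def instantiate(query_pattern: str, bindings: dict[str, str]) -> str:
    """Fill every placeholder in the query pattern with its remembered entity."""

    def fill(match: re.Match) -> str:
        name = match.group(1)
        if name not in bindings:
            raise KeyError(f"no entity remembered for placeholder [{name}]")
        return bindings[name]

    return PLACEHOLDER.sub(fill, query_pattern)


query_pattern = (
    'SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . '
    "?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }"
)
print(instantiate(query_pattern, {"MOVIE": "Dreamgirls"}))
```

Raising an error for an unbound placeholder, rather than leaving it in the query, avoids sending a query that still contains a literal such as "[CITY]" to the answer data.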

The QALL-ME Framework is based on a Service Oriented Architecture (SOA). Such an architecture enables distributed computing and enforces loose coupling of system components, the so-called services. Every service can be seen as a “black box” with a specialized functionality which is accessible only via standardized interfaces. An orchestration of these services creates a so-called business process. In the QALL-ME Framework the business process is always a QA system; the service components are, for example, entity annotators, query generators or textual entailment recognizers. Orchestrations of different service implementations create different QA systems, and existing service implementations can easily be reused in different systems.
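The idea of black-box components behind fixed interfaces can be illustrated with a small in-process analogy in Python. This is only a sketch of the design principle: the interface names, method signatures and toy implementations below are assumptions made for illustration and do not correspond to the framework's actual service descriptions, which are web services rather than Python classes.

```python
from typing import Protocol


class EntityAnnotator(Protocol):
    """Interface: turn a question into a pattern plus the replaced entities."""
    def annotate(self, question: str) -> tuple[str, dict[str, str]]: ...


class QueryGenerator(Protocol):
    """Interface: turn a question pattern and its entities into a query."""
    def generate(self, pattern: str, bindings: dict[str, str]) -> str: ...


class ToyOrchestrator:
    """Wires services together; it only knows the interfaces, not the implementations."""

    def __init__(self, annotator: EntityAnnotator, generator: QueryGenerator) -> None:
        self.annotator = annotator
        self.generator = generator

    def query_for(self, question: str) -> str:
        pattern, bindings = self.annotator.annotate(question)
        return self.generator.generate(pattern, bindings)


class ToyMovieAnnotator:
    """Hypothetical annotator that only knows a single movie title."""

    def annotate(self, question: str) -> tuple[str, dict[str, str]]:
        if "Dreamgirls" in question:
            return question.replace("Dreamgirls", "[MOVIE]"), {"MOVIE": "Dreamgirls"}
        return question, {}


class ToyCinemaQueryGenerator:
    """Hypothetical generator that always answers with a cinema query."""

    def generate(self, pattern: str, bindings: dict[str, str]) -> str:
        movie = bindings.get("MOVIE", "")
        return (
            f'SELECT ?cinemaName WHERE {{ ?movie qmo:name "{movie}" . '
            "?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }"
        )


system = ToyOrchestrator(ToyMovieAnnotator(), ToyCinemaQueryGenerator())
print(system.query_for("Where can I see the movie Dreamgirls?"))
```

Swapping in a different annotator or generator that satisfies the same interface yields a different QA system without touching the orchestrating code, which is the reuse property the paragraph above describes.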

The SOA of the QALL-ME Framework is realized using web service (WS) technology. In particular, the framework is built around WSs specified according to WSDL 1.1, which has been published by the W3C as a W3C Note. WSDL-based WS implementations can be written in any programming language; their intercommunication is realized entirely with XML messages according to the SOAP protocol. For each QA component in the framework there is a WSDL description. Implementations of these descriptions can be dynamically orchestrated with a special WS component that we call the QA planner. As the QA planner is a WS itself, it can easily be included in larger applications, for example in a website or a mobile application which makes use of a certain QA system.

See section 2.3: “System Architecture” for the actual architecture behind the QALL-ME Framework with all its WS components and their default orchestration.

[1] Here and in the following “schema” does not (only) refer to RDF Schema but to any definition of a schema which is represented with RDF triples. This includes RDF Schema definitions but also OWL ontologies and others.

[2] … which are out of scope for this manual and thus shall not be topic …