Chapter 2. Framework Description


The QALL-ME Framework is an architecture skeleton for multilingual question answering (QA) systems that answer questions with the help of structured answer data sources from freely specifiable domains. The language barrier is crossed with the help of a domain ontology. Provisions are made to easily anchor questions in space and time. Finding the mapping between question and answer – which is the major step in all QA systems – is done using a novel approach which is based on textual entailment recognizers. The framework is based on a Service Oriented Architecture (SOA) which is realized using web services.

The QALL-ME Framework is free software and comes with a set of demo components which illustrate the potential of the approach and which help new developers to get started with the framework. The framework seeks to be compliant with standards as far as possible in order to enhance interoperability and ease of use.

If certain parts of the implementations in the framework do not suit your needs, you can always build a custom implementation that is still based on the QALL-ME Framework idea. We believe that the general ideas behind the framework are the crucial aspect of QALL-ME, and they can be realized in many different ways; the QALL-ME Framework is just one of them. In particular, it provides a basic realization in the form of the architectural skeleton described above, with the aim of getting you started with your own QA system relatively fast.

The following sections provide a closer look at the ideas and main features of the QALL-ME Framework as listed above. If you are more interested in the technical details of how the framework works, then you might also directly head over to section 2.3: “System Architecture”; please note, however, that this section assumes a basic knowledge of the ideas behind the QALL-ME Framework which is only explicitly provided here.

2.1.1. Multilingual QA

The QALL-ME Framework provides a skeleton for multilingual QA systems. Such QA systems support questions and answers in several natural languages, such as German, Italian or English. The key point is that an answer can be in a different language than the question, i.e., answers can be retrieved crosslingually. Imagine, for example, a German tourist in Italy asking a question in German about accommodations with certain facilities in the region. A QA system based on the QALL-ME Framework could return suitable answers retrieved from Italian data, even though the language of the inquiry differs from the language of the available data.

The question (source) and answer (target) languages of a QA system based on the QALL-ME Framework don’t need to be the same. The system may well support more source languages than target languages or the other way round. The sets of supported source and target languages need not even overlap if a scenario requires this.

The set of supported target languages usually corresponds to the main languages of the locations for which answer data is available. The target language for a certain question can therefore usually be inferred from the spatial context of the question; see also section 2.1.4: “Spatiotemporal Anchoring of Questions”.

See the upcoming section “Use of a Domain Ontology” to get an idea of how this crosslingual functionality is realized.

2.1.2. Structured Answer Data Sources

QA systems built with the QALL-ME Framework query structured data sources for suitable answers. Such data sources are usually databases of some kind but may also be simple XML documents with a certain structure. In any case, the data structures used have to be accessible via predefined RDF interfaces; in the simplest case the data is already available in the correct RDF schema^[1]. The concrete RDF schema used to represent answer data is specified by the domain ontology. This implies that the answer data sources are always bound to a certain domain which, however, can be freely specified – as described in the next section.

2.1.3. Use of a Domain Ontology

In order to use the QALL-ME Framework, you have to provide an ontology describing the domain on which the resulting QA system operates: it contains concept descriptions for the target domain and descriptions of possible relations between these concepts. The ontology is then used in two ways. Firstly, it provides a schema to represent the structured answer data (cf. previous section). In other words, the answer data is described as instances of the ontology concepts using the vocabulary that the ontology provides. Secondly, the ontology is indirectly used to cross the language barrier in multilingual QA. This second use is actually just a nice side effect of the first: by describing the answer data with the ontology vocabulary, we obtain a representation of the data which is independent of the original language of the data. We then only need to create a mapping from the question to a query which also uses the ontology vocabulary; that query can then be applied to the answer data. The upcoming section “Recognizing Textual Entailment for QA” illustrates how this mapping is achieved in the QALL-ME Framework using textual entailment recognizers.
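
The language independence of such a representation can be illustrated with a minimal Python sketch. It stores answer data as plain (subject, predicate, object) triples instead of using a real RDF store. The `ex:` resources, the `qmo:` terms and the helper functions `match` and `hotels_with` are assumptions made for this illustration only; they are not part of the framework.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Answer data that might originate from an Italian and a German source.
# Both are described with the same (hypothetical) ontology vocabulary;
# only literal values such as names keep their original language.
DATA: list[Triple] = [
    ("ex:hotel1", "rdf:type", "qmo:Hotel"),
    ("ex:hotel1", "qmo:name", "Albergo Bellavista"),
    ("ex:hotel1", "qmo:hasFacility", "qmo:SwimmingPool"),
    ("ex:hotel2", "rdf:type", "qmo:Hotel"),
    ("ex:hotel2", "qmo:name", "Gasthof Seeblick"),
    ("ex:hotel2", "qmo:hasFacility", "qmo:Sauna"),
]


def match(data: list[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> list[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in data
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def hotels_with(data: list[Triple], facility: str) -> list[str]:
    """Names of all resources that offer the given facility concept."""
    subjects = [s for s, _, _ in match(data, p="qmo:hasFacility", o=facility)]
    return [name for s in subjects
            for _, _, name in match(data, s=s, p="qmo:name")]


print(hotels_with(DATA, "qmo:SwimmingPool"))
```

The same query works no matter which language the underlying data source was written in, because it only refers to ontology terms.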

2.1.4. Spatiotemporal Anchoring of Questions

By default the QALL-ME Framework anchors questions in space and time. This means that all questions always have a spatial and temporal context. This should be natural: one can always use deictic expressions such as “here” or “tomorrow” in a question. Therefore, a question posed at eight o’clock in Berlin may potentially mean something completely different than the same question posed at five o’clock in Amsterdam; for example: “Where can I see a nice action movie nearby in about half an hour?”.

In multilingual settings (cf. section 2.1.1) the spatial anchoring is usually used to find the target language. The temporal anchor is also a crucial part in the QALL-ME Framework; more information about the usage of spatial and temporal anchoring of questions can be found in section 2.3: “System Architecture” with the description of the QA planner and the TimeAnnotator components.
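
As a rough illustration of what a spatiotemporal context adds to a question, the following Python sketch resolves a deictic time expression and picks a target language from the context. All names (`QuestionContext`, `MAIN_LANGUAGE`, `DEICTIC_OFFSETS`), the lookup tables and the concrete dates are assumptions made for this example; they do not describe the QA planner or the TimeAnnotator components.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical lookup tables; a real system would derive them from its data.
MAIN_LANGUAGE = {"Berlin": "de", "Amsterdam": "nl", "Trento": "it"}
DEICTIC_OFFSETS = {
    "in about half an hour": timedelta(minutes=30),
    # Simplification: "tomorrow" is treated as the same time on the next day.
    "tomorrow": timedelta(days=1),
}


@dataclass(frozen=True)
class QuestionContext:
    """Where and when a question was asked."""
    location: str
    asked_at: datetime


def target_language(ctx: QuestionContext) -> str:
    """Infer the target language from the spatial anchor."""
    return MAIN_LANGUAGE[ctx.location]


def resolve_time(expression: str, ctx: QuestionContext) -> datetime:
    """Turn a deictic time expression into an absolute point in time."""
    return ctx.asked_at + DEICTIC_OFFSETS[expression]


# The same question asked in two different contexts (dates are arbitrary,
# and "five o'clock" is assumed to mean 17:00 here).
berlin = QuestionContext("Berlin", datetime(2024, 5, 1, 8, 0))
amsterdam = QuestionContext("Amsterdam", datetime(2024, 5, 1, 17, 0))

for ctx in (berlin, amsterdam):
    print(ctx.location, target_language(ctx),
          resolve_time("in about half an hour", ctx))
```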

2.1.5. Recognizing Textual Entailment for QA

Traditional QA approaches for structured answer data often performed a deep analysis of the question in order to obtain a logical form, which was then translated to a query suitable for application to the answer data. For various reasons^[2] this approach has been found to be inadequate or even infeasible. In the QALL-ME Framework we address the mapping between natural language question and database query differently, using RTE (Recognizing Textual Entailment) components. With RTE components, the problems of finding logical representations of questions and then mapping these to database queries are bypassed through semantic inference at the textual level.

An RTE component can recognize whether some text T entails a hypothesis H, i.e., whether the meaning of H can be fully derived from the meaning of T. In the context of the QALL-ME Framework, H is always more or less a minimal form of a question about some topic and T is more or less the question which has to be answered. Through RTE we can find out which minimal questions are contained in a given question. The set of minimal questions has to be defined in advance, i.e., each (minimal) question that shall be answerable has to be in the set. Through RTE we can then handle all reformulations of these minimal questions. One crucial part is still missing on our way to the answer: the step from a minimal question to the appropriate answer. In principle that’s easy: we simply attach the correct answer to each minimal question and return that answer if the minimal question is entailed by the user question.

As you may have guessed already, in practice the process described above is a bit more complicated. Instead of using minimal natural language questions we use patterns of minimal natural language questions. To get these patterns we replace certain entities in the minimal questions with placeholders. Furthermore, these patterns are not mapped directly to answers but rather to database query patterns containing the same placeholders as the question patterns. The question processing then works as follows: we first replace all entities in the user question with placeholders. Next, the RTE component tells us which of the (minimal) question patterns in our set is entailed by the question pattern we have just created from the input question. In the database query pattern that corresponds to the entailed question pattern, we replace the placeholders with the entities that we removed from the original question. Now we have a complete database query which can easily be applied to the answer data.
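
The two placeholder steps of this process (masking entities in the question and later filling them back into a query pattern) can be sketched in Python as follows. This is a minimal sketch under strong assumptions: entities are found by exact string lookup in a hypothetical `gazetteer` dictionary, and each placeholder label occurs at most once per question. The function names are invented for this illustration and are not part of the framework.

```python
import re


def mask_entities(question: str, gazetteer: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Replace known entities with placeholders.

    Returns the resulting question pattern together with the bindings
    (placeholder label -> original entity) that were removed.
    """
    pattern = question
    bindings: dict[str, str] = {}
    for entity, label in gazetteer.items():
        if entity in pattern:
            pattern = pattern.replace(entity, f"[{label}]")
            bindings[label] = entity
    return pattern, bindings


def fill_placeholders(query_pattern: str, bindings: dict[str, str]) -> str:
    """Replace every [LABEL] in the query pattern with its bound entity."""
    return re.sub(r"\[([A-Z]+)\]", lambda m: bindings[m.group(1)], query_pattern)


# Hypothetical gazetteer mapping entity strings to placeholder labels.
gazetteer = {"Dreamgirls": "MOVIE", "Berlin": "CITY"}

pattern, bindings = mask_entities("Where can I see Dreamgirls in Berlin?", gazetteer)
print(pattern)
print(bindings)
print(fill_placeholders('?movie qmo:name "[MOVIE]" .', bindings))
```

Selecting the entailed question pattern happens between these two steps and is the job of the RTE component.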

Example

As the query generation step using RTE is the central part of the QALL-ME Framework, we should take the time to go through an example of this step. Our imaginary sample QA system supports English input questions, so before it can answer any questions, there has to be an RTE component for English. Furthermore, mappings from minimal question patterns to query patterns have to be created for all English question types that shall be answerable. Let’s assume we have the following mappings:

Question Pattern	Query Pattern
Who is the director of the movie `[MOVIE]`?	`SELECT ?directorName WHERE { ?movie qmo:name "[MOVIE]" . ?movie qmo:hasDirector ?person . ?person qmo:name ?directorName . }`
Who wrote the screenplay for the movie `[MOVIE]`?	`SELECT ?writerName WHERE { ?movie qmo:name "[MOVIE]" . ?movie qmo:hasWriter ?person . ?person qmo:name ?writerName . }`
Where can I see the movie `[MOVIE]`?	`SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }`
In which cinema in `[CITY]` can I see the movie `[MOVIE]`?	`SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:isInCity "[CITY]" . ?cinema qmo:name ?cinemaName . }`
Which movies can I see in `[CINEMA]`?	`SELECT ?movieName WHERE { ?cinema qmo:name "[CINEMA]" . ?cinema qmo:showsMovie ?movie . ?movie qmo:name ?movieName . }`

Note

As the QALL-ME Framework assumes the answer data to be represented in RDF, it is only natural to use SPARQL as the database query language; see also the AnswerPool component in section 2.3.2: “System Components”. Note, however, that the SPARQL queries in the examples above are highly simplified and not well-formed (for example, the `qmo:` prefix is never declared). Additionally, the examples use vocabulary from an imaginary RDF schema.

Observe the placeholders like [MOVIE] or [CITY] in each part of the mappings: for each placeholder in the question pattern of every mapping there is a corresponding placeholder in the SPARQL query pattern.

With the above mappings and the English RTE component we are now ready to answer the first question. Imagine the user has asked the following:

“Which cinemas show the movie Dreamgirls tonight?”

In the first step we replace all entities with placeholders and remember the replaced entities; this yields a pattern of the input question which is the input text T to the RTE component:

“Which cinemas show the movie [MOVIE] tonight?” ([MOVIE] = “Dreamgirls”)

The RTE component now has a text to work with. What’s still missing is a hypothesis H: this hypothesis is set to the question pattern of the first of our pattern mappings, then to the second and so on, until we find a question pattern (H) which is entailed by the pattern we have created from our input question (i.e., the text T). In our example we can stop with the third mapping of the above list: “Which cinemas show the movie [MOVIE] tonight?” (T) textually entails “Where can I see the movie [MOVIE]?” (H). We thus know the SPARQL query pattern that will be used to find the answer to the original question:

“SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }”

In a last step we now only need to replace all placeholders in the query pattern with the entities that we have remembered for our input question and get a complete SPARQL query that can be used to find the answer to the original question:

“SELECT ?cinemaName WHERE { ?movie qmo:name "Dreamgirls" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }”
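
The search for the first entailed question pattern can be sketched in Python as shown below. Note that `toy_entails` is only a stand-in for a real RTE component: it checks whether the content words of H are a subset of those of T, using tiny hand-written stop word and lemma tables that are just large enough for this example. The function names and tables are assumptions made for this sketch; only the first three mappings of the table above are included.

```python
import re
from typing import Callable, Optional

# First three mappings from the table above: (question pattern, query pattern).
MAPPINGS = [
    ("Who is the director of the movie [MOVIE]?",
     'SELECT ?directorName WHERE { ?movie qmo:name "[MOVIE]" . ?movie qmo:hasDirector ?person . ?person qmo:name ?directorName . }'),
    ("Who wrote the screenplay for the movie [MOVIE]?",
     'SELECT ?writerName WHERE { ?movie qmo:name "[MOVIE]" . ?movie qmo:hasWriter ?person . ?person qmo:name ?writerName . }'),
    ("Where can I see the movie [MOVIE]?",
     'SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }'),
]

# Toy normalisation tables (assumptions, sufficient for this example only).
STOPWORDS = {"who", "is", "the", "of", "for", "where", "which", "in", "can", "i"}
LEMMAS = {"cinemas": "cinema", "movies": "movie", "show": "see"}


def content_words(pattern: str) -> set[str]:
    """Placeholders plus normalised non-stop-words of a question pattern."""
    words = set()
    for token in re.findall(r"\[[A-Z]+\]|\w+", pattern):
        if token.startswith("["):
            words.add(token)  # keep placeholders such as [MOVIE] as they are
        elif token.lower() not in STOPWORDS:
            words.add(LEMMAS.get(token.lower(), token.lower()))
    return words


def toy_entails(text: str, hypothesis: str) -> bool:
    """Stand-in for an RTE component: H's content words must all occur in T."""
    return content_words(hypothesis) <= content_words(text)


def select_query_pattern(text: str, mappings: list[tuple[str, str]],
                         entails: Callable[[str, str], bool]) -> Optional[str]:
    """Try each question pattern as hypothesis H; return the first entailed one's query."""
    for hypothesis, query_pattern in mappings:
        if entails(text, hypothesis):
            return query_pattern
    return None


T = "Which cinemas show the movie [MOVIE] tonight?"
print(select_query_pattern(T, MAPPINGS, toy_entails))
```

Because the recognizer is passed in as a function, a real RTE component could be substituted without changing the selection loop.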

2.1.6. Service Oriented Architecture

The QALL-ME Framework is based on a Service Oriented Architecture (SOA). Such an architecture enables distributed computing and enforces loose coupling of system components, the so-called services. Every service can be seen as a “black box” with a specialized functionality which is accessible only via standardized interfaces. An orchestration of these services creates a so-called business process. In the QALL-ME Framework the business process is always a QA system; the service components are, for example, entity annotators, query generators or textual entailment recognizers. Orchestrations of different service implementations create different QA systems, and existing service implementations can easily be reused in different systems.
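
The idea of loose coupling can be illustrated in a few lines of Python. The sketch below runs in a single process and does not use web services at all; it only shows an orchestrating component that knows its services through interfaces alone. The interfaces, method names and stub implementations are hypothetical and are not the framework's actual service descriptions. In particular, the stub query generator uses an exact pattern lookup where a real system would use an RTE component.

```python
from typing import Protocol


class EntityAnnotator(Protocol):
    def annotate(self, question: str) -> tuple[str, dict[str, str]]: ...


class QueryGenerator(Protocol):
    def generate(self, pattern: str, bindings: dict[str, str]) -> str: ...


class QASystem:
    """Orchestrates the services; depends only on their interfaces."""

    def __init__(self, annotator: EntityAnnotator, generator: QueryGenerator):
        self.annotator = annotator
        self.generator = generator

    def query_for(self, question: str) -> str:
        pattern, bindings = self.annotator.annotate(question)
        return self.generator.generate(pattern, bindings)


class GazetteerAnnotator:
    """Stub service: finds entities by exact string lookup."""

    def __init__(self, gazetteer: dict[str, str]):
        self.gazetteer = gazetteer

    def annotate(self, question: str) -> tuple[str, dict[str, str]]:
        bindings = {}
        for entity, label in self.gazetteer.items():
            if entity in question:
                question = question.replace(entity, f"[{label}]")
                bindings[label] = entity
        return question, bindings


class TemplateQueryGenerator:
    """Stub service: exact pattern lookup instead of textual entailment."""

    def __init__(self, templates: dict[str, str]):
        self.templates = templates

    def generate(self, pattern: str, bindings: dict[str, str]) -> str:
        query = self.templates[pattern]
        for label, entity in bindings.items():
            query = query.replace(f"[{label}]", entity)
        return query


system = QASystem(
    GazetteerAnnotator({"Dreamgirls": "MOVIE"}),
    TemplateQueryGenerator({
        "Where can I see the movie [MOVIE]?":
            'SELECT ?cinemaName WHERE { ?movie qmo:name "[MOVIE]" . ?cinema qmo:showsMovie ?movie . ?cinema qmo:name ?cinemaName . }',
    }),
)
print(system.query_for("Where can I see the movie Dreamgirls?"))
```

Either stub can be replaced by a different implementation of the same interface without touching `QASystem`, which mirrors how different service implementations yield different QA systems.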

The SOA of the QALL-ME Framework is realized using web service (WS) technology. In particular, the framework is built around WSs specified according to the WSDL 1.1 specification, which is published by the W3C. WSDL-based WS implementations can be written in any programming language; their intercommunication is realized entirely with XML messages according to the SOAP protocol. For each QA component in the framework there is a WSDL description. Implementations of these descriptions can be dynamically orchestrated with a special WS component that we call the QA planner. As the QA planner is a WS itself, it can easily be included in larger applications, for example in a website or a mobile application which makes use of a certain QA system.

See section 2.3: “System Architecture” for the actual architecture behind the QALL-ME Framework with all its WS components and their default orchestration.

^[1]Here and in the following “schema” does not (only) refer to RDF Schema but to any definition of a schema which is represented with RDF triples. This includes RDF Schema definitions but also OWL ontologies and others.

^[2]… which are out of scope for this manual and thus shall not be topic …
