4.2.3. Language Specific Components

Language specific components in the QALL-ME Framework are components that work on specific aspects of natural language texts, in our case questions. In principle, each of these components needs a separate implementation for every natural language that the final question answering (QA) system is to support.[10]

Language specific component implementations do not belong to the core QALL-ME Framework. However, some demo component implementations for different languages are provided as examples in Java – see section 4.4: “Demo Components Description” for details.

The annotateTerms web method annotates language specific terms in a given sentence with language independent forms. A term might be a movie genre or some facility in a hotel. For example, “Komödie” is the German term for a comedy; in Spanish it is “comedia”. These two language specific terms might be annotated by two TermAnnotator implementations with a common, canonical form such as “comedy”. The input text which is to be annotated is an <AnnotatedSentence> (cf. 4.2.1: “General Remarks”); annotated parts of the input will not be annotated again.

Sample request (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<AnnotatedSentence xmlns="http://qallme.sf.net/xsd/qallmeshared.xsd"
		>Wo kann ich heute eine Komödie sehen?</AnnotatedSentence>
</soapenv:Body>

Sample response (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<AnnotatedSentence xmlns="http://qallme.sf.net/xsd/qallmeshared.xsd"
		>Wo kann ich heute eine <annotation canonicalForm="comedy" type="GENRE"
		>Komödie</annotation> sehen?</AnnotatedSentence>
</soapenv:Body>
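
The annotation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the demo implementation: the lexicon `LEXICON` and the function name `annotate_terms` are hypothetical, and the lookup is a plain dictionary match. Text that already sits inside an `<annotation>` element is copied unchanged, so it is not annotated again.

```python
import re
import xml.etree.ElementTree as ET

NS = "http://qallme.sf.net/xsd/qallmeshared.xsd"
ET.register_namespace("", NS)  # serialize with a default namespace, as in the samples

# Hypothetical German lexicon: surface term -> (canonical form, annotation type).
LEXICON = {"Komödie": ("comedy", "GENRE")}
TERM_RE = re.compile(r"\b(" + "|".join(map(re.escape, LEXICON)) + r")\b")


def annotate_terms(sentence_xml):
    """Wrap known terms in <annotation> elements; keep existing annotations as they are."""
    old = ET.fromstring(sentence_xml)
    new = ET.Element(old.tag, old.attrib)
    last = None  # most recently appended child; None while still in the leading text

    def add_text(text):
        nonlocal last
        # re.split with a capturing group alternates plain text (even) and matches (odd).
        for i, piece in enumerate(TERM_RE.split(text or "")):
            if i % 2:
                canonical, term_type = LEXICON[piece]
                last = ET.SubElement(new, f"{{{NS}}}annotation",
                                     canonicalForm=canonical, type=term_type)
                last.text = piece
            elif last is None:
                new.text = (new.text or "") + piece
            else:
                last.tail = (last.tail or "") + piece

    add_text(old.text)
    for child in old:
        tail, child.tail = child.tail, None
        new.append(child)  # existing annotation: copied, never re-annotated
        last = child
        add_text(tail)     # only the unannotated text after it is searched
    return ET.tostring(new, encoding="unicode")


print(annotate_terms(
    f'<AnnotatedSentence xmlns="{NS}">Wo kann ich heute eine Komödie sehen?</AnnotatedSentence>'
))
```

A real TermAnnotator would typically need morphological handling and a much larger lexicon; the sketch only shows how mixed text and annotation content can be kept apart.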

The annotateTime web method annotates language specific temporal expressions in a given sentence with language independent canonical forms in the extended TIMEX2 format (cf. 4.2.1: “General Remarks”). A temporal expression might be “today”, “next week”, “at 11 p.m.”, etc. The input text which is to be annotated is an <AnnotatedSentence> (again, see 4.2.1); annotated parts of the input will not be annotated again. In addition to the text to annotate, a temporal context has to be provided: this context is the exact point in time against which relative temporal expressions are resolved to absolute ones. Examples of such relative temporal expressions in English are “tomorrow”, “in two hours” or “next weekend”.

Sample request (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<tim:AnnotationRequest xmlns:tim="http://qallme.sf.net/wsdl/timeannotation.wsdl">
		<AnnotatedSentence xmlns="http://qallme.sf.net/xsd/qallmeshared.xsd"
			>Wo kann ich heute eine <annotation canonicalForm="comedy" type="GENRE"
			>Komödie</annotation> sehen?</AnnotatedSentence>
		<temporalContext>2007-03-18T12:58:21</temporalContext>
	</tim:AnnotationRequest>
</soapenv:Body>

Sample response (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<AnnotatedSentence xmlns="http://qallme.sf.net/xsd/qallmeshared.xsd"
		>Wo kann ich <annotation canonicalForm="&lt;TIMEX2 VAL=&quot;2007-03-18&quot;>heute&lt;/TIMEX2>" type="TIMEX2"
		>heute</annotation> eine <annotation canonicalForm="comedy" type="GENRE"
		>Komödie</annotation> sehen?</AnnotatedSentence>
</soapenv:Body>

The relative temporal expression “heute” (“today”) has been normalized to an absolute date using the temporal context of the annotated text.
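
How a relative expression is resolved against the temporal context can be sketched as follows. The table `RELATIVE_DAYS` and the function `normalize` are hypothetical names chosen for illustration; a real annotator covers far more expression types than simple day offsets.

```python
from datetime import datetime, timedelta

# Hypothetical German lexicon: relative expression -> day offset from the context.
RELATIVE_DAYS = {"gestern": -1, "heute": 0, "morgen": 1}


def normalize(expression, temporal_context):
    """Return a TIMEX2 canonical form with an absolute date for a relative expression."""
    context = datetime.fromisoformat(temporal_context)
    day = context.date() + timedelta(days=RELATIVE_DAYS[expression.lower()])
    return f'<TIMEX2 VAL="{day.isoformat()}">{expression}</TIMEX2>'


print(normalize("heute", "2007-03-18T12:58:21"))
# prints: <TIMEX2 VAL="2007-03-18">heute</TIMEX2>
```

The returned string corresponds to the unescaped value of the canonicalForm attribute in the sample response above.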

The QueryGenerator is a WS whose main web method generates a SPARQL “CONSTRUCT” query for an annotated natural language question. Internally, this main web method calls an EntailmentTester (cf. 4.2.3.4) and a Timex2SparqlConverter (cf. 4.2.2.3); the WSDL locations of the implementations of these two web services can be set and queried with further web methods. The QueryGenerator’s WSDL description can be found under src/main/net/sf/qallme/res/wsdl/querygeneration.wsdl. See section 2.1.5: “Recognizing Textual Entailment for QA” for general information on how a QueryGenerator is meant to be implemented.

The QueryGenerator has the following web methods:

The generateSPARQLQuery web method generates a SPARQL “CONSTRUCT” query for an annotated natural language question. The question is passed as an <AnnotatedSentence> element (cf. 4.2.1: “General Remarks”), which may carry annotations for language specific terms, temporal expressions and location specific entities. In addition, a spatiotemporal context parameter has to be given. It is intended to anchor SPARQL queries that would otherwise contain no spatial or temporal restrictions: for a question like “Where can I see Casino Royale?” we probably do not want to return all cinemas that have ever shown or will ever show the movie, but only those cinemas that are showing it in the next few days.

The SPARQL “CONSTRUCT” queries generated by implementations of the generateSPARQLQuery web method should have the following properties. The RDF graph that the query describes, and which is eventually created by the AnswerPool WS, should contain the answer, the expected answer type, the context of the answer in the form of the information from the question, and possibly further information that may support the later presentation of the answer. Such supporting information might be, for example, the geocoordinates of a cinema or the full address of a hotel. From this RDF graph, a presentation component (which is not part of the QALL-ME Framework) could then take exactly the information that it deems appropriate for answer presentation. See the description of the answer graph structure of the demo implementations in section 4.4.1.2 for a recommended RDF graph structure.

Sample request (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<quer:generateSPARQLQuery xmlns:quer="http://qallme.sf.net/wsdl/querygeneration.wsdl"
	                          xmlns:qal="http://qallme.sf.net/xsd/qallmeshared.xsd">
		<queryGenRequest>
			<qal:AnnotatedSentence
				>Wo kann ich <qal:annotation canonicalForm="&lt;TIMEX2 VAL=&quot;2007-03-18&quot;&gt;heute&lt;/TIMEX2&gt;" type="TIMEX2"
				>heute</qal:annotation> in <qal:annotation canonicalForm="Schmelz" type="DESTINATION"
				>Schmelz</qal:annotation> <qal:annotation canonicalForm="Dreamgirls" type="MOVIE"
				>Dreamgirls</qal:annotation> sehen?</qal:AnnotatedSentence>
			<qal:spatiotemporalContext>
				<qal:location lat="49.256409" lon="7.042437">DE</qal:location>
				<qal:time>2007-03-18T12:58:23</qal:time>
			</qal:spatiotemporalContext>
		</queryGenRequest>
	</quer:generateSPARQLQuery>
</soapenv:Body>

The spatial part of the spatiotemporal context in the above sample request, i.e., the <location> element, specifies the exact location of the user who posed the question. This position is given as geocoordinates, i.e., latitude and longitude values. The location string (“DE” in the example) can be anything; however, it is recommended to use the ISO 3166-1 alpha-2 country code of the user’s location, so that the location can still be identified if reverse geocoding of the given coordinates is not possible. The temporal part of the context is a simple time value.
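
As an illustration of how an implementation might read these values, the following sketch parses a spatiotemporal context element with Python's standard library. The function name `read_context` and the returned dictionary layout are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

QAL = "{http://qallme.sf.net/xsd/qallmeshared.xsd}"

SAMPLE = """<qal:spatiotemporalContext xmlns:qal="http://qallme.sf.net/xsd/qallmeshared.xsd">
    <qal:location lat="49.256409" lon="7.042437">DE</qal:location>
    <qal:time>2007-03-18T12:58:23</qal:time>
</qal:spatiotemporalContext>"""


def read_context(context_xml):
    """Extract coordinates, location string and time from a spatiotemporal context."""
    context = ET.fromstring(context_xml)
    location = context.find(f"{QAL}location")
    return {
        "lat": float(location.get("lat")),
        "lon": float(location.get("lon")),
        # Free-form string; an ISO 3166-1 alpha-2 code is only a recommendation.
        "location": location.text,
        "time": datetime.fromisoformat(context.find(f"{QAL}time").text),
    }


print(read_context(SAMPLE))
```

The location string serves as a fallback here: it is returned as-is and would only be consulted if the coordinates cannot be resolved.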

Sample response (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<string xmlns="http://qallme.dfki.de/wsdl/qallmeshared.wsdl"><![CDATA[
			PREFIX qmo: <http://qallme.itc.it/ontology/qallme-tourism.owl#>
			PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
			PREFIX qma: <http://qallme.fbk.eu/ontology/qallme-answers.owl#>
			PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
			CONSTRUCT {
			?cinema qmo:name ?cinemaName ;
			        a qmo:Cinema ;
			        qmo:hasPostalAddress ?pAddr ;
			        qmo:hasGPSCoordinate ?gps .
			?pAddr qmo:isInDestination ?dest ; 
			       qmo:street ?street .
			?dest qmo:name ?destName .
			?gps qmo:longitude ?longitude ;
			     qmo:latitude ?latitude ;
			     qmo:statusOfGPSCoordinate ?gpsStat .
			?movie qmo:name ?movieName ;
			       a qmo:Movie .
			?event qmo:isInSite ?cinema ;
			       qmo:hasEventContent ?movie ;
			       qmo:hasPeriod ?period .
			?period qmo:hasDatePeriod ?datePeriod ;
			        a qmo:DateTimePeriod ;
			        qmo:hasTimePeriod ?timePeriod .
			?datePeriod qmo:startDate ?date .
			?timePeriod qmo:startTime ?time .
			qma:AnswerInstance a qma:AnswersObject ;
			                   qma:hasAnswerValue ?cinema .
			}
			WHERE {
			?cinema qmo:name ?cinemaName ;
			        a qmo:Cinema ;
			        qmo:hasPostalAddress ?pAddr .
			?pAddr qmo:isInDestination ?dest .
			OPTIONAL { ?pAddr qmo:street ?street . } .
			?dest qmo:name ?destName .
			?movie qmo:name ?movieName ;
			       a qmo:Movie .
			?event qmo:isInSite ?cinema ;
			       qmo:hasEventContent ?movie ;
			       qmo:hasPeriod ?period .
			?period qmo:hasDatePeriod ?datePeriod ;
			        a qmo:DateTimePeriod ;
			        qmo:hasTimePeriod ?timePeriod .
			?datePeriod qmo:startDate ?date .
			?timePeriod qmo:startTime ?time .
			OPTIONAL { ?cinema qmo:hasGPSCoordinate ?gps .
			           ?gps qmo:longitude ?longitude ;
			                qmo:latitude ?latitude .
			           OPTIONAL { ?gps qmo:statusOfGPSCoordinate ?gpsStat . } . } .
			FILTER(?destName = "Schmelz"^^xsd:string) .
			FILTER(?movieName = "Dreamgirls"^^xsd:string) .
			FILTER (((xsd:dateTime("2007-03-18T00:00:00") <= xsd:dateTime(fn:string-join(fn:string-join(xsd:string(?date),"T"),xsd:string(?time)))) && (xsd:dateTime("2007-03-19T06:00:00") >= xsd:dateTime(fn:string-join(fn:string-join(xsd:string(?date),"T"),xsd:string(?time)))))).
			}]]></string>
</soapenv:Body>

The last “FILTER” expression in the generated SPARQL query has been created from the TIMEX2 annotated temporal expression in the input by the internally set Timex2SparqlConverter (see also the setTimex2SparqlConverter web method).
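
The sample suggests that a TIMEX2 day value is turned into a time window from midnight of that day to 06:00 of the following day. The sketch below reproduces that window; the 06:00 cut-off is inferred from the sample response only, and the function names as well as the single `?start` variable (standing in for the combined date and time of the sample query) are hypothetical simplifications. It is not the converter's actual implementation.

```python
from datetime import date, datetime, time, timedelta


def day_window(timex2_val):
    """Map a TIMEX2 day value such as '2007-03-18' to a (start, end) window."""
    day = date.fromisoformat(timex2_val)
    start = datetime.combine(day, time(0, 0))
    # Assumption taken from the sample: the window runs until 06:00 the next day.
    end = datetime.combine(day + timedelta(days=1), time(6, 0))
    return start, end


def sparql_filter(timex2_val, var="?start"):
    """Build a simplified FILTER restricting a dateTime variable to the window."""
    start, end = day_window(timex2_val)
    return (f'FILTER (xsd:dateTime("{start.isoformat()}") <= {var} '
            f'&& {var} <= xsd:dateTime("{end.isoformat()}"))')


print(sparql_filter("2007-03-18"))
```

The two boundary values produced for "2007-03-18" are the same ones that appear in the FILTER expression of the sample response.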

The setEntailmentTester web method sets or resets the EntailmentTester implementation which is internally used when calling the generateSPARQLQuery web method. Resetting the web service in this way should not be used in production systems, as it is potentially problematic, and not only with regard to security. For example, a client cannot be sure that the EntailmentTester it has just set is still valid when invoking the generateSPARQLQuery web method; in the meantime another client might already have reset the EntailmentTester.

As when setting the WS implementations used by the QAPlanner, a WS implementation is always specified by the URL of its WSDL document.

Sample request (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<quer:setEntailmentTester xmlns:quer="http://qallme.sf.net/wsdl/querygeneration.wsdl">
		<uri>http://localhost:8080/qmfdemo/de-entailmenttest/EntailmentTesterWS?wsdl</uri>
	</quer:setEntailmentTester>
</soapenv:Body>

This sample request sets the given EntailmentTester implementation as the internally used one for calls to the QueryGenerator’s generateSPARQLQuery web method.

Sample response (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<ns2:setEntailmentTesterResponse xmlns:ns2="http://qallme.sf.net/wsdl/querygeneration.wsdl">
		<dummy xsi:type="xs:string" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
	</ns2:setEntailmentTesterResponse>
</soapenv:Body>
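
The caveat about resetting the EntailmentTester can be made concrete with a toy model. The class `QueryGeneratorModel` and the second URL are hypothetical; the model only mimics one shared, mutable setting and does not call any web service.

```python
class QueryGeneratorModel:
    """Toy stand-in for a service with one shared EntailmentTester setting."""

    def __init__(self):
        self.entailment_tester_uri = None

    def set_entailment_tester(self, uri):
        # Every client writes to the same setting.
        self.entailment_tester_uri = uri

    def generate_sparql_query(self, question):
        # Report which tester would be used for this call.
        return self.entailment_tester_uri


service = QueryGeneratorModel()

# Client A sets its tester and expects it to be used for its next query.
service.set_entailment_tester(
    "http://localhost:8080/qmfdemo/de-entailmenttest/EntailmentTesterWS?wsdl")

# Client B (hypothetical URL) resets the tester before A sends its question.
service.set_entailment_tester("http://example.org/other/EntailmentTesterWS?wsdl")

# Client A's query now runs against B's tester.
used = service.generate_sparql_query("Wo kann ich heute eine Komödie sehen?")
print(used)
```

Because the setting is global to the service, nothing ties a set call to the query call that follows it, which is why the text advises against using the setter in production.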

The setTimex2SparqlConverter web method sets or resets the Timex2SparqlConverter implementation which is internally used when calling the generateSPARQLQuery web method. Resetting the web service in this way should not be used in production systems, as it is potentially problematic, and not only with regard to security. For example, a client cannot be sure that the Timex2SparqlConverter it has just set is still valid when invoking the generateSPARQLQuery web method; in the meantime another client might already have reset the Timex2SparqlConverter.

As when setting the WS implementations used by the QAPlanner, a WS implementation is always specified by the URL of its WSDL document.

Sample request (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<quer:setTimex2SparqlConverter xmlns:quer="http://qallme.sf.net/wsdl/querygeneration.wsdl">
		<uri>http://localhost:8080/qmfdemo/timex2sparqlconversion/Timex2SparqlConverterWS?wsdl</uri>
	</quer:setTimex2SparqlConverter>
</soapenv:Body>

This sample request sets the given Timex2SparqlConverter implementation as the internally used one for calls to the QueryGenerator’s generateSPARQLQuery web method.

Sample response (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<ns2:setTimex2SparqlConverterResponse xmlns:ns2="http://qallme.sf.net/wsdl/querygeneration.wsdl">
		<dummy xsi:type="xs:string" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
	</ns2:setTimex2SparqlConverterResponse>
</soapenv:Body>

The EntailmentTester is a WS which tests for one or more pairs of annotated sentences whether the first sentence of each pair entails the second sentence of that pair or to what degree this entailment holds. The EntailmentTester’s WSDL description can be found under src/main/net/sf/qallme/res/wsdl/entailmenttest.wsdl.

Implementing a good textual entailment engine is not a trivial task. You might want to consider trying out the free, language independent EDITS system for your own EntailmentTester implementation. EDITS is a textual entailment recognition system which has its roots in the QALL-ME project, just like the QALL-ME Framework.

The EntailmentTester has the following web methods:

The getEntailmentProbability web method determines, for one or more pairs of annotated sentences, to what degree the first sentence of each pair entails the second sentence of that pair. This “degree of entailment” is returned as a probability value between 0 and 1 for each pair.

Every sentence of each pair is represented by an extended version of the <AnnotatedSentence> element introduced in 4.2.1: “General Remarks”: the extended element has either an id attribute or a ref attribute. An <AnnotatedSentence> with a ref attribute refers to the sentence that has been defined with the ID given as the value of the ref attribute. The EntailmentTester shall treat a sentence that is included by reference exactly like one that is defined directly with an id attribute.

Sample request (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<ent:getEntailmentProbability xmlns:ent="http://qallme.sf.net/wsdl/entailmenttest.wsdl"
	                              xmlns:ent1="http://qallme.sf.net/xsd/entailmenttest.xsd"
	                              xmlns:ns2="http://qallme.sf.net/xsd/qallmeshared.xsd"
	                              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
		<pairs>
			<ent1:AnnotatedSentence id="no1" xsi:type="ent1:AnnotatedSentenceId"
				>Wo in <ns2:annotation canonicalForm="Schmelz" type="DESTINATION"
				>Schmelz</ns2:annotation> kann ich <ns2:annotation canonicalForm="&lt;TIMEX2 VAL=&quot;2007-03-18TEV&quot;>heute Abend&lt;/TIMEX2>" type="TIMEX2"
				>heute Abend</ns2:annotation> <ns2:annotation canonicalForm="Dreamgirls" type="MOVIE"
				>Dreamgirls</ns2:annotation> sehen?</ent1:AnnotatedSentence>
			<ent1:AnnotatedSentence id="no2" xsi:type="ent1:AnnotatedSentenceId"
				>In welchem Kino in <ns2:annotation canonicalForm="Schmelz" type="DESTINATION"
				>Schmelz</ns2:annotation> kann ich <ns2:annotation canonicalForm="&lt;TIMEX2 VAL=&quot;2007-03-18TEV&quot;>heute Abend&lt;/TIMEX2>" type="TIMEX2"
				>heute Abend</ns2:annotation> <ns2:annotation canonicalForm="Dreamgirls" type="MOVIE"
				>Dreamgirls</ns2:annotation> sehen?</ent1:AnnotatedSentence>
			<ent1:AnnotatedSentence ref="no1" xsi:type="ent1:AnnotatedSentenceRef" />
			<ent1:AnnotatedSentence id="no3" xsi:type="ent1:AnnotatedSentenceId"
				>Wo kann ich in <ns2:annotation canonicalForm="Schmelz" type="DESTINATION"
				>Schmelz</ns2:annotation> <ns2:annotation canonicalForm="&lt;TIMEX2 VAL=&quot;2007-03-18TEV&quot;>heute Abend&lt;/TIMEX2>" type="TIMEX2"
				>heute Abend</ns2:annotation> den Film <ns2:annotation canonicalForm="Dreamgirls" type="MOVIE"
				>Dreamgirls</ns2:annotation> sehen?</ent1:AnnotatedSentence>
		</pairs>
	</ent:getEntailmentProbability>
</soapenv:Body>

The sentences are read as consecutive pairs. Because of the ID reference in this example (ID “no1”), the first sentence of the second pair is identical to the first sentence of the first pair.

Sample response (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<ns3:getEntailmentProbabilityResponse xmlns:ns2="http://qallme.sf.net/xsd/entailmenttest.xsd"
	                                      xmlns:ns3="http://qallme.sf.net/wsdl/entailmenttest.wsdl">
		<probabilities>
			<ns2:probability>0.79949445</ns2:probability>
			<ns2:probability>0.8114218</ns2:probability>
		</probabilities>
	</ns3:getEntailmentProbabilityResponse>
</soapenv:Body>

For every pair of the input an entailment probability value is returned.
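
The id/ref mechanism and the grouping into pairs can be sketched as follows. To keep the example short, it uses a simplified, namespace-free `<pairs>` element and reduces each sentence to its plain text; both simplifications, as well as the function name `sentence_pairs`, are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<pairs>
  <AnnotatedSentence id="no1">Wo in Schmelz kann ich heute Abend Dreamgirls sehen?</AnnotatedSentence>
  <AnnotatedSentence id="no2">In welchem Kino in Schmelz kann ich heute Abend Dreamgirls sehen?</AnnotatedSentence>
  <AnnotatedSentence ref="no1"/>
  <AnnotatedSentence id="no3">Wo kann ich in Schmelz heute Abend den Film Dreamgirls sehen?</AnnotatedSentence>
</pairs>"""


def sentence_pairs(pairs_xml):
    """Resolve ref attributes and group the sentences into consecutive pairs."""
    sentences = list(ET.fromstring(pairs_xml))
    # First pass: collect every directly defined sentence by its ID.
    by_id = {s.get("id"): "".join(s.itertext())
             for s in sentences if s.get("id") is not None}
    # Second pass: replace each reference by the sentence it points to.
    resolved = [by_id[s.get("ref")] if s.get("ref") is not None else by_id[s.get("id")]
                for s in sentences]
    # Sentences 1+2 form the first pair, 3+4 the second, and so on.
    return list(zip(resolved[0::2], resolved[1::2]))


for first, second in sentence_pairs(SAMPLE):
    print(first, "=>", second)
```

Each resulting pair would then be scored once, which matches the two probability values in the sample response for the four `<AnnotatedSentence>` elements in the request.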

The entails web method tests for one or more pairs of annotated sentences whether or not the first sentence of each pair entails the second sentence of that pair. The input for the method is basically the same as for the getEntailmentProbability web method.

Sample request (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<ent:entails xmlns:ent="http://qallme.sf.net/wsdl/entailmenttest.wsdl"
	             xmlns:ent1="http://qallme.sf.net/xsd/entailmenttest.xsd"
	             xmlns:ns2="http://qallme.sf.net/xsd/qallmeshared.xsd"
	             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
		<pairs>
			<ent1:AnnotatedSentence id="no1" xsi:type="ent1:AnnotatedSentenceId"
				>Wo in <ns2:annotation canonicalForm="Schmelz" type="DESTINATION"
				>Schmelz</ns2:annotation> kann ich <ns2:annotation canonicalForm="&lt;TIMEX2 VAL=&quot;2007-03-18TEV&quot;>heute Abend&lt;/TIMEX2>" type="TIMEX2"
				>heute Abend</ns2:annotation> <ns2:annotation canonicalForm="Dreamgirls" type="MOVIE"
				>Dreamgirls</ns2:annotation> sehen?</ent1:AnnotatedSentence>
			<ent1:AnnotatedSentence id="no2" xsi:type="ent1:AnnotatedSentenceId"
				>In welchem Kino in <ns2:annotation canonicalForm="Schmelz" type="DESTINATION"
				>Schmelz</ns2:annotation> kann ich <ns2:annotation canonicalForm="&lt;TIMEX2 VAL=&quot;2007-03-18TEV&quot;>heute Abend&lt;/TIMEX2>" type="TIMEX2"
				>heute Abend</ns2:annotation> <ns2:annotation canonicalForm="Dreamgirls" type="MOVIE"
				>Dreamgirls</ns2:annotation> sehen?</ent1:AnnotatedSentence>
			<ent1:AnnotatedSentence ref="no1" xsi:type="ent1:AnnotatedSentenceRef" />
			<ent1:AnnotatedSentence id="no3" xsi:type="ent1:AnnotatedSentenceId"
				>Wo kann ich in <ns2:annotation canonicalForm="Schmelz" type="DESTINATION"
				>Schmelz</ns2:annotation> <ns2:annotation canonicalForm="&lt;TIMEX2 VAL=&quot;2007-03-18TEV&quot;>heute Abend&lt;/TIMEX2>" type="TIMEX2"
				>heute Abend</ns2:annotation> den Film <ns2:annotation canonicalForm="Dreamgirls" type="MOVIE"
				>Dreamgirls</ns2:annotation> sehen?</ent1:AnnotatedSentence>
		</pairs>
	</ent:entails>
</soapenv:Body>

The sentences are read as consecutive pairs. Because of the ID reference in this example (ID “no1”), the first sentence of the second pair is identical to the first sentence of the first pair.

Sample response (SOAP body):

<soapenv:Body xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
	<ns3:entailsResponse xmlns:ns2="http://qallme.sf.net/xsd/entailmenttest.xsd"
	                     xmlns:ns3="http://qallme.sf.net/wsdl/entailmenttest.wsdl">
		<probabilities>
			<ns2:entails>true</ns2:entails>
			<ns2:entails>true</ns2:entails>
		</probabilities>
	</ns3:entailsResponse>
</soapenv:Body>

For every pair of the input a boolean value is returned which indicates whether the entailment for the respective pair holds or not.



[10] If an implementation of a language specific component type supports several languages by itself, a single implementation is of course sufficient. However, this implementation has to be registered with the QAPlanner once per supported language, using the setComponent web method (cf. 4.2.2.2).