
The Semantic Entailment task seeks to move Natural Language Processing towards Natural Language Understanding by defining its goals in terms of the meaning of natural language, namely recognizing when the meaning of one text span follows from the meaning of another. We believe that progress in Textual Entailment will allow us to better define an applied notion of Semantics for natural language. Moreover, progress towards a solution to the Semantic Entailment problem will have immediate applications in a wide range of NLP tasks, particularly in Question Answering.
Textual Entailment is the task of recognizing when the meaning of one utterance follows from the meaning of another.
One obvious application of a Textual Entailment system is to the task of Question Answering (QA). As the amount of information on the web grows, it becomes harder to find any particular piece of it. Research on Question Answering systems aims at making the task of finding information easier. The goal is to replace current search technologies, which are based solely on keyword search, with the ability to process questions and find explicit answers for them. Textual Entailment offers a promising direction for QA systems, by allowing them to leverage the meaning of queries rather than just their individual terms.
Our approach to textual entailment leverages deep and shallow semantic content. Deep content is represented as interdependent predicate-argument structures, while shallow semantic content is encoded via similarity metrics between different types of text constituents such as words, multi-word expressions, and named entities. The two are unified in a machine learning and inference architecture. In addition, we are investigating methods that simplify text by identifying implicit relations and making them explicit. Our first step in this direction is our work on Comma Resolution.
A special case of Textual Entailment is Focused Textual Entailment, in which the set of potential hypotheses is limited in structure and domain, for example, when the hypothesis makes use of a limited set of relations and entities of interest. This restriction potentially allows focused development of resources supporting entailment.
For the following entailment pair, most readers would agree that the Text entails the Hypothesis:
Text:
The three freshmen managed to enter the University's Student Union.
Hypothesis:
Three people went into a building.
In the next example, most people would agree that the Text does NOT entail the Hypothesis:
Text:
The terrorists threatened to detonate a nuclear device capable of destroying a large city.
Hypothesis:
The terrorists detonated a nuclear bomb.
This relaxed definition, proposed by the PASCAL Recognizing Textual Entailment Challenge, avoids some of the problems with classical logical entailment, while getting good interannotator agreement on the PASCAL RTE corpora. The definition does not commit to a categorization of knowledge required to determine entailment, but does allow for background knowledge that humans typically bring to bear on this problem -- i.e., it is not restricted to purely linguistic phenomena. As an example, most people would say that in the following example, the Text entails the Hypothesis:
Text:
John Smith spent six years in jail for his role in a number of violent armed robberies.
Hypothesis:
John Smith was charged with two or more violent crimes.
Determining entailment in this example requires both background knowledge of the judicial process and the capacity to reason about numbers.
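To make the three example pairs concrete, the following is a minimal sketch, not part of our system, that stores each pair with the judgment most readers would give and scores it with a naive word-overlap measure. The names `EntailmentPair`, `tokens`, and `hypothesis_coverage` are hypothetical and introduced only for illustration; the tokenizer is deliberately crude.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EntailmentPair:
    text: str
    hypothesis: str
    entails: bool  # the judgment most readers would give for this pair


def tokens(s: str) -> set:
    """Distinct lowercased word tokens (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z']+", s.lower()))


def hypothesis_coverage(pair: EntailmentPair) -> float:
    """Fraction of distinct hypothesis words that also occur in the text."""
    hyp = tokens(pair.hypothesis)
    return len(hyp & tokens(pair.text)) / len(hyp)


PAIRS = [
    EntailmentPair(
        "The three freshmen managed to enter the University's Student Union.",
        "Three people went into a building.",
        True,
    ),
    EntailmentPair(
        "The terrorists threatened to detonate a nuclear device capable of "
        "destroying a large city.",
        "The terrorists detonated a nuclear bomb.",
        False,
    ),
    EntailmentPair(
        "John Smith spent six years in jail for his role in a number of "
        "violent armed robberies.",
        "John Smith was charged with two or more violent crimes.",
        True,
    ),
]

for pair in PAIRS:
    print(f"{hypothesis_coverage(pair):.2f}  entails={pair.entails}  {pair.hypothesis}")
```

Under this measure the non-entailing pair scores highest (four of its six hypothesis words appear in the text), while the first entailing pair shares only the word "three". Surface overlap alone therefore ranks these examples in the wrong order.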
The Cognitive Computation Group's approach to Textual Entailment tackles the problem of background knowledge by encoding as much knowledge as possible as similarity metrics, while also modeling deep semantic structure in the underlying text. Our system induces a graph-based representation of the text of the input entailment pair, annotated by a variety of machine-learned tools (such as our Semantic Role Labeler and our Named Entity Recognizer). An inference procedure then compares constituents of the two text spans using the specialized comparison resources, and unifies them via a machine learning component trained on labeled Textual Entailment data.
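The general shape of this idea, typed constituents compared by type-specific similarity metrics whose scores feed a learned decision, can be sketched as follows. This is an illustrative simplification under stated assumptions, not our implementation: it uses flat lists rather than a graph, the constituent types and the metrics `word_sim`, `number_sim`, and `entity_sim` are hypothetical, and the weights passed to `entailment_probability` are placeholders standing in for parameters that would be learned from labeled entailment data.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Constituent:
    kind: str   # hypothetical type labels: "word", "number", "entity"
    value: str


def word_sim(a: str, b: str) -> float:
    """Exact match, ignoring case."""
    return 1.0 if a.lower() == b.lower() else 0.0


def number_sim(a: str, b: str) -> float:
    """Numeric equality, so "6" matches "6.0"."""
    return 1.0 if float(a) == float(b) else 0.0


def entity_sim(a: str, b: str) -> float:
    """Jaccard overlap of name tokens, so "Smith" partly matches "John Smith"."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# One specialized comparison resource per constituent type.
METRICS: Dict[str, Callable[[str, str], float]] = {
    "word": word_sim,
    "number": number_sim,
    "entity": entity_sim,
}


def best_match(h: Constituent, text: List[Constituent]) -> float:
    """Score of the best-matching text constituent of the same type."""
    metric = METRICS[h.kind]
    return max((metric(h.value, t.value) for t in text if t.kind == h.kind),
               default=0.0)


def alignment_features(text: List[Constituent],
                       hypothesis: List[Constituent]) -> List[float]:
    """Mean and minimum best-match score over the hypothesis constituents."""
    scores = [best_match(h, text) for h in hypothesis]
    return [sum(scores) / len(scores), min(scores)]


def entailment_probability(features: List[float],
                           weights: List[float], bias: float) -> float:
    """Logistic combination of the features; weights would normally be learned."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


text = [Constituent("entity", "John Smith"),
        Constituent("number", "6"),
        Constituent("word", "jail")]
hypothesis = [Constituent("entity", "Smith"), Constituent("word", "jail")]

features = alignment_features(text, hypothesis)
# Placeholder weights, chosen by hand for illustration only.
print(features, entailment_probability(features, weights=[4.0, 2.0], bias=-3.0))
```

The minimum score is included as a feature because a single hypothesis constituent with no counterpart in the text is often enough to block entailment, which an average alone would hide.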
An example of a Focused Textual Entailment domain is that of Document Anonymization. In this domain, guidelines specify a set of relations of interest (these are almost certainly in an abstract form); the task is to identify text in a corresponding set of documents that are entailed by these guidelines. One way to model this problem is as relation detection, where relation participants may be underspecified; for example, a guideline might specify:
"Any civilian charged with a violent crime."
where the domain is "reports of illegal activity".
For the above example, given a corpus of documents representing reports (police reports, newspaper reports), we would like to scan the documents and detect such cases as:
"Bill Jones, headmaster of the Fisher School in Smalltown, Va., served six months in prison for striking pupil John Smith."
and reject such cases as:
"Bill Jones, Smallville County Clerk, was charged with illegally striking voters from the electoral roll."
We consider this to be an entailment task because shallow methods are unlikely to reliably detect cases where the relevant information needs to be pieced together, either from multiple text segments or via background knowledge.
We consider it to be a focused entailment task because there are a limited number of such guidelines, and the restricted domain permits development of specialized resources that could be expected to significantly enhance performance on this task. In this example, background knowledge of the judicial process, and of the distinction between civilian and non-civilian job titles, seems essential to good performance. Such resources would have significantly less impact in a general Textual Entailment setting.
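The guideline example can be sketched as relation detection with underspecified participants: the guideline names a relation and places constraints on its participants rather than naming specific entities. The code below is a hypothetical illustration, not our system. It assumes an upstream step has already extracted a `Fact` from each sentence and assigned a coarse offense category, and the knowledge tables `IMPLIED_BY`, `NON_CIVILIAN_TITLES`, and `VIOLENT_OFFENSES` are small hand-written stand-ins for the specialized resources discussed above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass(frozen=True)
class Fact:
    """A relation instance assumed to come from an upstream extraction step."""
    person: str
    job_title: str
    relation: str   # e.g. "charged_with", "served_prison_time_for"
    offense: str    # coarse offense category, assumed assigned upstream


@dataclass(frozen=True)
class Guideline:
    """A relation of interest whose participants are constraints, not entities."""
    relation: str
    person_ok: Callable[[Fact], bool]
    offense_ok: Callable[[Fact], bool]


# Hypothetical background knowledge, hand-written for this illustration.
# Judicial process: serving prison time implies having been charged.
IMPLIED_BY: Dict[str, Set[str]] = {
    "charged_with": {"charged_with", "served_prison_time_for"},
}
NON_CIVILIAN_TITLES = {"soldier", "naval officer"}
VIOLENT_OFFENSES = {"assault", "armed_robbery"}


def satisfies(fact: Fact, guideline: Guideline) -> bool:
    """True if the fact instantiates the guideline's relation and constraints."""
    accepted = IMPLIED_BY.get(guideline.relation, {guideline.relation})
    return (fact.relation in accepted
            and guideline.person_ok(fact)
            and guideline.offense_ok(fact))


# "Any civilian charged with a violent crime."
GUIDELINE = Guideline(
    relation="charged_with",
    person_ok=lambda f: f.job_title.lower() not in NON_CIVILIAN_TITLES,
    offense_ok=lambda f: f.offense in VIOLENT_OFFENSES,
)

headmaster = Fact("Bill Jones", "headmaster",
                  "served_prison_time_for", "assault")
clerk = Fact("Bill Jones", "county clerk",
             "charged_with", "electoral_roll_tampering")

print(satisfies(headmaster, GUIDELINE))  # accepted: prison time implies a charge
print(satisfies(clerk, GUIDELINE))       # rejected: the offense is not violent
```

Note that the headmaster sentence never uses the word "charged", and the clerk sentence does; the decision depends on the background knowledge tables, not on the surface wording.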
One of the long-term goals of this work is to develop a system that can be adapted by an end user to a particular domain, via an interactive interface that would allow them to identify missing resources and to enhance or correct existing resources. To this end, we are also enhancing the reports our system creates that explain the final entailment decision, with a view toward allowing an end user to use this output to correct mistakes or add new information.