Machine Learning and Natural Language
Fall 2008
Experimental Assignment II: Textual Entailment
(Due 4/25/08 and 5/2/08)
General
- The assignment will be done in teams. There are 5
participating teams (no Preparation and Evaluation Team
this time).
- Your final report on the assignment is
due on Friday, May 2nd. At that time, each team will
turn in its results along with a short report
describing what you did, what difficulties you encountered,
and what you concluded.
- There will be a meeting devoted to the presentation of
the results on Monday 5/5/08, 10am, 3405SC, taking the
place of a final exam in the class.
- Feel free to send me e-mail or come to ask questions.
Textual Entailment
Textual Entailment is the task of determining, for example,
that the sentence "WalMart defended itself in court today
against claims that its female employees were kept out of jobs in
management because they are women" entails that "WalMart
was sued for sexual discrimination".
Determining whether the meaning of a given text snippet entails
that of another or whether they have the same meaning is
a fundamental problem in natural language understanding that
requires the ability to abstract over the inherent syntactic and
semantic variability in natural language.
This challenge is at the heart of many high level natural language
processing tasks including Question Answering, Information
Retrieval and Extraction, Machine Translation, and others that
attempt to reason about and capture the meaning of linguistic
expressions.
This task was defined only recently, but it is already a very active area of research.
Recommended Reading on Textual Entailment:
- Textual Entailment Portal
- Proceedings of the first "Recognizing Textual Entailment"
Challenge, 2005
- Proceedings of the second "Recognizing Textual Entailment"
Challenge, 2006
- Proceedings of the third "Recognizing Textual Entailment"
Challenge, 2007
- Oren Glickman, Ido Dagan, and Moshe Koppel, A Probabilistic
Classification Approach for Lexical Textual Entailment.
AAAI (2005).
- R. Braz, R. Girju, V. Punyakanok, D. Roth, and M.
Sammons, An Inference Model for Semantic Entailment and
Question-Answering. AAAI (2005).
- Rajat Raina, Andrew Y. Ng, and Christopher Manning,
Robust textual inference via learning and abductive
reasoning. AAAI (2005).
- Also see an invited presentation
I gave at a workshop on Textual Entailment at ACL-05.
- A tutorial on Textual
Entailment, given by Ido Dagan, Dan Roth and Fabio
Zanzotto at ACL-2007,
the 13th Conference of the Association for Computational
Linguistics.
The goal of this assignment is for you to think about and
implement a strategy for deciding Textual Entailment. Given a
pair of sentences, t and h, you want to determine
if t entails h. The goal is to return True
(entails) or False, along with a confidence in this
decision.
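As a rough illustration of the expected input and output, here is a minimal Python sketch of a word-overlap decision procedure. The function names, the crude tokenizer, the 0.6 threshold, and the way the confidence is derived from the overlap score are all assumptions made for illustration; this is not the lexical similarity package or any reference system.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (deliberately crude)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_entailment(t, h, threshold=0.6):
    """Return (decision, confidence) for whether t entails h.

    The score is the fraction of distinct tokens of h that also occur in t.
    The threshold is a hypothetical value; it should be tuned on development data.
    """
    t_tokens = set(tokenize(t))
    h_tokens = set(tokenize(h))
    if not h_tokens:
        # Nothing to judge: an arbitrary convention for an empty hypothesis.
        return False, 0.0
    score = len(h_tokens & t_tokens) / len(h_tokens)
    decision = score >= threshold
    # Assumed confidence: the overlap itself for True, its complement for False.
    confidence = score if decision else 1.0 - score
    return decision, confidence


T = ("WalMart defended itself in court today against claims that its female "
     "employees were kept out of jobs in management because they are women")
H = "WalMart was sued for sexual discrimination"

decision, confidence = lexical_entailment(T, H)
print(decision, round(confidence, 3))
```

On the WalMart example this sketch answers False, since only one of the six distinct words of h appears in t. That is exactly the kind of pair that requires going beyond word overlap.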
The Assignment
You will be given two collections of (t,h) pairs.
The first collection consists of the development
sets for the three RTE challenges; you can use it to
develop your strategy: look at the data, study it in different
ways, train classifiers on it if you'd like, etc.
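One simple way to use the development data, sketched below in Python, is to tune a decision threshold for any scoring function that assigns each pair a number (higher meaning "more likely to entail"). The function name and the toy scores are hypothetical; a real system would plug in its own scores and the gold labels from the development sets.

```python
def best_threshold(scores, gold):
    """Return (threshold, accuracy) maximizing accuracy of `score >= threshold`.

    scores: list of floats, one per development pair (must be non-empty).
    gold:   list of booleans of the same length (True = entails).
    """
    # Every observed score is a candidate; infinity means "always answer False".
    candidates = sorted(set(scores)) + [float("inf")]
    best_th, best_acc = None, -1.0
    for th in candidates:
        correct = sum((s >= th) == g for s, g in zip(scores, gold))
        acc = correct / len(gold)
        if acc > best_acc:  # ties keep the lowest threshold
            best_th, best_acc = th, acc
    return best_th, best_acc


# Toy development data, invented for illustration only.
dev_scores = [0.2, 0.4, 0.7, 0.9]
dev_gold = [False, False, True, True]
print(best_threshold(dev_scores, dev_gold))
```

The threshold chosen this way should then be frozen before it is applied to the test sets.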
The second collection of pairs, the test set, consists of
the test sets of all three RTE challenges. Your goal is to
achieve good results on the test sets. However, you will evaluate
and report your results on both development and test data, and
report separately on each of the RTE collections.
Both collections are annotated with a task and with
the entailment classification (True/False); needless to say,
the annotation of the test set will be used only for
evaluating your results.
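Since results have to be reported separately per collection and per split, a small bookkeeping helper along the following lines may be useful. This is only a sketch in Python: the record layout (a group key, a gold label, and a prediction) is an assumption, and it does not replace the evaluation script provided with the data.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy per group.

    records: iterable of (group, gold, predicted) triples, where group is any
    hashable key, e.g. ("RTE1", "dev") or ("RTE2", "test", "IR").
    Returns a dict mapping each group to its accuracy.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, gold, predicted in records:
        total[group] += 1
        correct[group] += int(gold == predicted)
    return {group: correct[group] / total[group] for group in total}


# Toy records, invented for illustration only.
records = [
    (("RTE1", "dev"), True, True),
    (("RTE1", "dev"), False, True),
    (("RTE2", "test"), False, False),
]
for group, acc in sorted(accuracy_by_group(records).items()):
    print(group, acc)
```

Using the task annotation as part of the group key gives the per-task breakdown with the same function.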
Data: The three development sets and the three test sets
are available here.
Please search for RTE and you will find development and test
data for RTE{1,2,3}.
In addition to the raw sentence pairs (in XML files), the
data has been processed by a semantic role labeling program
and a few other tools, and is available in a "column format".
This file explains the column format.
An evaluation script is available here.
Please note that
this data was processed with an earlier version of the semantic role
labeler; you can process the data yourself with this
tool or with any other tools.
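For the raw XML files, a reader along the following lines may be a convenient starting point. This Python sketch assumes that each pair is a `pair` element with `t` and `h` children, an `id` and a `task` attribute, and a gold label stored in either a `value` or an `entailment` attribute; these names are assumptions that should be checked against the actual files, and the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample; the element and attribute names are assumptions.
SAMPLE = """<entailment-corpus>
  <pair id="1" value="TRUE" task="IR">
    <t>A dog barked loudly.</t>
    <h>A dog barked.</h>
  </pair>
  <pair id="2" entailment="NO" task="QA">
    <t>A dog barked loudly.</t>
    <h>A cat slept.</h>
  </pair>
</entailment-corpus>"""


def read_pairs(xml_text):
    """Parse an XML string into a list of dicts with id, task, gold, t and h."""
    root = ET.fromstring(xml_text)
    pairs = []
    for pair in root.iter("pair"):
        # Accept either label attribute; treat TRUE/YES as positive.
        label = pair.get("value") or pair.get("entailment")
        pairs.append({
            "id": pair.get("id"),
            "task": pair.get("task"),
            "gold": label is not None and label.upper() in ("TRUE", "YES"),
            "t": (pair.findtext("t") or "").strip(),
            "h": (pair.findtext("h") or "").strip(),
        })
    return pairs


for p in read_pairs(SAMPLE):
    print(p["id"], p["task"], p["gold"], "|", p["t"], "=>", p["h"])
```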
Some Tools
- Before you start using the preprocessed (column
format) data, I would like you to develop a baseline
that is purely lexical.
- You can use the following lexical similarity package, written by Quang Do,
and build a baseline lexical textual entailment system on top
of it. Please consult the Readme file.
- Notice that the data provided to you already contains
a lot of information in the column format
representation, including part of speech data, named
entity data and semantic role labeling information. This file explains the column format.
Other resources can be made available. For example,
the following script allows access to a Number and Quantities demo, which you can also
look at here.
Deliverables
- Your first assignment is a baseline system that runs
only lexical textual entailment. You can use the
similarity tool provided or develop your own. Please
email me, on or before April 25, a short report
(no longer than 2 pages; 11pt font; PDF file) that
tabulates the experiments you ran and explains
what you did. Please report results separately
for each of the RTEs and separately for development and
test. Make sure to look at the data and discuss the
results you observe. This will later be a section in
your final report.
- For your final product, design three different
versions of your entailment system, starting with
your baseline system, and moving to (at least) two
more sophisticated approaches.
- You must use some external resources (web,
wordnet, corpora, etc.) and some preprocessing
tools.
Experiment with your systems, and compare them both
globally and on each of the tasks separately.
Report results on all the data given to you.
Report
- Describe what you did: the specifics of your resources, algorithms and experiments.
- Conclude with some suggestions for improvements, future work, etc.
Send me only your report (no longer than 10 pages; 11pt font;
PDF file), but have a package of the code ready in case
you need to show something about it.
Grading
Your grade depends on:
- The quality of your report.
- The quality of your results.
- Your originality in going beyond the minimal requirements.
- The quality of your final presentation.
Due date
Friday, May 2nd. Presentation on May 5th.
Dan Roth