
Some of the resources below are copyrighted (marked with a '+') and require a password to access.
To use these resources, you must be a member of the LDC or have purchased
a license for the relevant resource. If you are a student at the University of Illinois at Urbana-Champaign and
need to use these resources, you may be covered by an existing license; check with your professor and then email
us to get access.
IT IS ILLEGAL TO SHARE A COPYRIGHTED LDC RESOURCE WITH PEOPLE OR ORGANIZATIONS WHO DO NOT HAVE EITHER AN LDC MEMBERSHIP OR A LICENSE
TO USE THE COPYRIGHTED RESOURCE.
The entailment corpora here are in plain text, XML, or column format. The XML files contain the truth value of each sentence pair. Each plain text file is accompanied by a .info file that lists the truth values in the same order as the sentence pairs in the associated corpus. This file explains the column format.
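For the plain text corpora, the truth values have to be matched back to the sentence pairs by position. The following is a minimal sketch of that alignment, not the official loader. It assumes a layout that is not specified on this page: one sentence pair per line with the text and hypothesis separated by a tab, and one truth value per line in the .info file. The function name `align_pairs_with_labels` and the label strings are hypothetical; adjust the parsing to the actual files.

```python
from typing import List, Tuple


def align_pairs_with_labels(corpus_text: str, info_text: str) -> List[Tuple[str, str, str]]:
    """Pair each sentence pair with the truth value at the same position."""
    # ASSUMPTION: one sentence pair per non-empty line, tab-separated.
    pair_lines = [line for line in corpus_text.splitlines() if line.strip()]
    # ASSUMPTION: one truth value per non-empty line of the .info file.
    labels = [line.strip() for line in info_text.splitlines() if line.strip()]

    # The .info file follows corpus order, so the counts must match.
    if len(pair_lines) != len(labels):
        raise ValueError(
            f"{len(pair_lines)} sentence pairs but {len(labels)} truth values"
        )

    aligned = []
    for line, label in zip(pair_lines, labels):
        parts = line.split("\t", 1)
        if len(parts) != 2:
            raise ValueError(f"expected a tab-separated pair, got: {line!r}")
        text, hypothesis = parts
        aligned.append((text.strip(), hypothesis.strip(), label))
    return aligned
```

Checking the counts before zipping matters because a silent truncation would shift every later label onto the wrong sentence pair.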
For a more extensive set of examples testing the kinds of phenomena modeled in the PARC dataset, take a look at the FRACAS dataset provided by Bill MacCartney of the Stanford University NLP Group.
The corpus and annotation guidelines developed for (V. Srikumar, R. Reichart, M. Sammons, A. Rappoport, and D. Roth, "Extraction of Entailed Semantic Relations Through Syntax-Based Comma Resolution", Proc. of the Annual Meeting of the ACL (2008)) can be downloaded here.
If you use this data, please cite the work referenced above.