Overview:
Recognizing named entities and the relations among them are essential sub-tasks of natural language understanding, with immediate applications in facilitating access to information.
We have developed machine learning and inference techniques for these tasks, focusing on exploiting the inter-dependencies between them as a way to improve performance on each of them.
We have developed a stand-alone, machine-learning-based, state-of-the-art named entity (NE) recognition tool, and have also studied inference methods that exploit the inter-dependencies between the two tasks. Our key approach, developed and studied in the context of several NLP and information extraction (IE) problems, is that of Constrained Conditional Models. We found that the use of constraints can significantly improve performance on both tasks. We have also developed this approach with the aim of minimizing the amount of labeled data required for IE tasks, including through semi-supervised and active learning protocols.
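As a rough illustration of how constraints can tie the two tasks together, the toy sketch below scores entity and relation labels jointly and discards any assignment that violates a type constraint. Everything in it is an assumption made for illustration: the label sets, the `RELATION_SIGNATURE` table, the function names, and the scores are all invented, and brute-force enumeration stands in for the linear programming inference described in the publications below.

```python
from itertools import product

# Hypothetical label sets, chosen only for this illustration.
ENTITY_LABELS = ["person", "location", "other"]
RELATION_LABELS = ["born_in", "spouse_of", "none"]

# Hypothetical hard constraints: each relation fixes the types of its arguments.
RELATION_SIGNATURE = {
    "born_in": ("person", "location"),
    "spouse_of": ("person", "person"),
}


def satisfies_constraints(e1, e2, rel):
    """Return True if the relation label is compatible with both entity labels."""
    if rel == "none":
        return True
    return RELATION_SIGNATURE[rel] == (e1, e2)


def independent_inference(e1_scores, e2_scores, rel_scores):
    """Pick the top-scoring label for each variable, ignoring constraints."""
    return tuple(max(s, key=s.get) for s in (e1_scores, e2_scores, rel_scores))


def joint_inference(e1_scores, e2_scores, rel_scores):
    """Pick the highest-scoring joint assignment that satisfies the constraints.

    Exhaustive search is used here for clarity; it does not scale
    beyond a handful of variables.
    """
    best, best_score = None, float("-inf")
    for e1, e2, rel in product(ENTITY_LABELS, ENTITY_LABELS, RELATION_LABELS):
        if not satisfies_constraints(e1, e2, rel):
            continue
        score = e1_scores[e1] + e2_scores[e2] + rel_scores[rel]
        if score > best_score:
            best, best_score = (e1, e2, rel), score
    return best, best_score


# Made-up local classifier scores for two entity mentions and one relation.
e1_scores = {"person": 0.6, "location": 0.3, "other": 0.1}
e2_scores = {"person": 0.45, "location": 0.40, "other": 0.15}
rel_scores = {"born_in": 0.7, "spouse_of": 0.2, "none": 0.1}

independent = independent_inference(e1_scores, e2_scores, rel_scores)
joint, joint_score = joint_inference(e1_scores, e2_scores, rel_scores)
print("independent:", independent)
print("joint:", joint)
```

With these invented scores, the independent predictions label the second entity as a person while also predicting `born_in`, which is inconsistent. The constrained search instead lets the confident relation score revise the second entity's label to a location.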
Relevant Publications:
- D. Roth and K. Small, Interactive Feature Space Construction using Semantic Information. Proc. of the Annual Conference on Computational Natural Language Learning (CoNLL) (2009)
- D. Roth and K. Small, Active Learning for Pipeline Models. Proceedings of the National Conference on Artificial Intelligence (AAAI) (2008)
- D. Roth and W. Yih, Global Inference for Entity and Relation Identification via a Linear Programming Formulation. Introduction to Statistical Relational Learning (2007)
- A. Doan, X. Li, and D. Roth, MEDIATE: Learning to match entity mentions across text and data bases. (2006)
- X. Li, P. Morie, and D. Roth, Semantic Integration in Text: From Ambiguous Names to Identifiable Entities. AI Magazine. Special Issue on Semantic Integration (2005) pp. 45--68
- W. Shen, X. Li, and A. Doan, Constraint-Based Entity Matching. Proceedings of the National Conference on Artificial Intelligence (AAAI) (2005)
- X. Li and D. Roth, Discriminative Training of Clustering Functions: Theory and Experiments with Entity Identification. Proc. of the Annual Conference on Computational Natural Language Learning (CoNLL) (2005) pp. 64--71
- X. Li and D. Roth, Discriminative Training of Clustering Functions: Theory and Experiments with Entity Identification. The Second Midwest Computational Linguistics Colloquium (MCLC) (2005)
- X. Li, P. Morie, and D. Roth, Identification and Tracing of Ambiguous Names: Discriminative and Generative Approaches. Proceedings of the National Conference on Artificial Intelligence (AAAI) (2004) pp. 419--424
- X. Li, P. Morie, and D. Roth, Robust Reading: Identification and Tracing of Ambiguous Names. Proc. of the Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL) (2004) pp. 17--24
- D. Roth and W. Yih, A Linear Programming Formulation for Global Inference in Natural Language Tasks. Proc. of the Annual Conference on Computational Natural Language Learning (CoNLL) (2004) pp. 1--8
- X. Li, P. Morie, and D. Roth, Robust Reading of Ambiguous Writing. (2003)
- D. Roth and W. Yih, Probabilistic Reasoning for Entity and Relation Recognition. Proc. of the International Conference on Computational Linguistics (COLING) (2002) pp. 835--841