Learning Coherent Concepts: Theory and Applications to Natural Language
Supported by NSF
Period: 1999-2002
This research seeks to develop an integrated view - theoretical understanding, algorithm development and experimental evaluation - of learning coherent concepts. These are learning scenarios common in cognitive learning, in which multiple learners co-exist and may learn different functions on the same input, subject to mutual compatibility constraints on their outcomes. Our effort will consist of developing a learning theory for these situations and of studying algorithmic ways to exploit them in natural language inferences.
The theoretical study will concentrate on developing a semantics for the coherency conditions and studying it from a learning theory point of view. The goal is to understand in what ways learning becomes easier and more robust in these situations. The algorithmic study will concentrate on developing ways to exploit coherency and will have a significant experimental component, using the problem of shallow parsing as a testbed for investigating the chaining of coherent classifiers and inferences that rely on the outcomes of several classifiers.
This research would have a significant impact on theoretical research in learning and on our ability to perform higher level inference in natural language. First, it would help to resolve the contrast between the predicted hardness of learning and the apparent ease with which cognitive systems learn. Moreover, it would provide an understanding of how to exploit coherency in order to develop better learning and inference methods for these situations, and would result in an integrated learning approach to a variety of shallow parsing tasks, implemented and demonstrated on a large scale using the SNoW learning architecture. Finally, incorporating the understanding of interacting classifiers, as well as methods to perform inferences that rely on several classifiers, into our learning system would be directly applicable to a variety of other tasks in this domain.
Relevant Publications:
- E. Daya, D. Roth, and S. Wintner, Learning Hebrew Roots: Machine Learning with Linguistic Constraints. Computational Linguistics (2008) [bibitem]
- R. de Salvo Braz, E. Amir, and D. Roth, Lifted First-order Probabilistic Inference. Proc. of the International Joint Conference on Artificial Intelligence (IJCAI) (2005) pp. 1319--1325 (abstract) [bibitem]
- R. Khardon, D. Roth, and R. Servedio, Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms. Journal of Machine Learning Research (2005) pp. 341--356 [bibitem]
- P. Fung and D. Roth, Guest Editors' Introduction: Machine Learning in Speech and Language Technologies. Machine Learning (2005) pp. 1--6 [bibitem]
- V. Punyakanok and D. Roth, Inference with Classifiers: The Phrase Identification Problem. Computational Linguistics (2005) [bibitem]
- E. Daya, D. Roth, and S. Wintner, Learning Hebrew Roots: Machine Learning with Linguistic Constraints. EMNLP (2004) pp. 168--178 (abstract) [bibitem]
- V. Punyakanok, D. Roth, W. Yih, D. Zimak, and Y. Tu, Semantic Role Labeling via Generalized Inference over Classifiers (Shared Task Paper). Proc. of the Annual Conference on Computational Natural Language Learning (CoNLL) (2004) pp. 130--133 (abstract) [bibitem]
- S. Agarwal, A. Awan, and D. Roth, Learning to Detect Objects in Images via a Sparse, Part-Based Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence (2004) pp. 1475--1490 (abstract) [bibitem]
- S. Har-Peled, D. Roth, and D. Zimak, Constraint Classification for Multiclass Classification and Ranking. The Conference on Advances in Neural Information Processing Systems (NIPS) (2003) pp. 785--792 [bibitem]
- C. Cumby and D. Roth, On Kernel Methods for Relational Learning. Proc. of the International Conference on Machine Learning (ICML) (2003) pp. 107--114 (abstract) [bibitem]
- A. Garg and D. Roth, Margin Distribution and Learning Algorithms. Proc. of the International Conference on Machine Learning (ICML) (2003) pp. 210--217 (abstract) [bibitem]
- C. Cumby and D. Roth, Feature Extraction Languages for Propositionalized Relational Learning. IJCAI Workshop on Learning Statistical Models from Relational Data (2003) (abstract) [bibitem]
- R. de Salvo Braz and D. Roth, Functional subsumption in Description Logics. NIPS Workshop on Feature Extraction and Selection (2003) [bibitem]
- R. Khardon, D. Roth, and R. Servedio, Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms. The Conference on Advances in Neural Information Processing Systems (NIPS) (2001) [bibitem]
- A. Garg, S. Har-Peled, and D. Roth, On generalization bounds, projection profile, and margin distribution. ICML (2002) pp. 171--178 (abstract) [bibitem]
- C. Cumby and D. Roth, Learning with Feature Description Logics. Proc. of the International Conference on Inductive Logic Programming (2002) pp. 32--47 (abstract) [bibitem]
- X. Carreras, L. Màrquez, V. Punyakanok, and D. Roth, Learning and Inference for Clause Identification. Proc. of the European Conference on Machine Learning (ECML) (2002) pp. 35--47 (abstract) [bibitem]
- S. Har-Peled, D. Roth, and D. Zimak, Constraint Classification: a new approach to Multiclass Classification. Proc. of the International Workshop on Algorithmic Learning Theory (ALT) (2002) pp. 135--150 (abstract) [bibitem]
- S. Agarwal and D. Roth, Learning a Sparse Representation for Object Detection. Proc. of the European Conference on Computer Vision (ECCV) (2002) pp. 113--128 (abstract) [bibitem]
- R. Greiner, A. Grove, and D. Roth, Learning Cost-Sensitive Active Classifiers. Artificial Intelligence (2002) pp. 137--174 (abstract) [bibitem]
- V. Punyakanok and D. Roth, The Use of Classifiers in Sequential Inference. The Conference on Advances in Neural Information Processing Systems (NIPS) (2001) pp. 995--1001 (abstract) [bibitem]
- D. Roth, Reasoning with Classifiers. Proc. of the European Conference on Machine Learning (ECML) (2001) pp. 506--510 [bibitem]
- A. Garg and D. Roth, Understanding Probabilistic Classifiers. Proc. of the European Conference on Machine Learning (ECML) (2001) pp. 179--191 (abstract) [bibitem]
- A. Garg and D. Roth, Learning Coherent Concepts. Proc. of the International Workshop on Algorithmic Learning Theory (ALT) (2001) pp. 135--150 (abstract) [bibitem]
- J. Chuang and D. Roth, Gene Recognition based on DAG shortest paths. The International Conference on Intelligent Systems for Molecular Biology (2001) pp. 1--9 (abstract) [bibitem]
- J. Chuang and D. Roth, Gene Recognition based on DAG shortest paths. Bioinformatics (2001) pp. S56--S64 (abstract) [bibitem]
- A. Carlson, J. Rosen, and D. Roth, Scaling Up Context Sensitive Text Correction. Proceedings of the National Conference on Innovative Applications of Artificial Intelligence (IAAI) (2001) pp. 45--50 (abstract) [bibitem]
- M. Yang, D. Roth, and N. Ahuja, Face Detection Using Large Margin Classifiers. Proceedings of IEEE International Conference on Image Processing (2001) pp. 665--668 (abstract) [bibitem]
- A. Grove and D. Roth, Linear concepts and hidden variables. Machine Learning (2001) pp. 123--141 (abstract) [bibitem]