Machine Learning and Natural Language

Spring 2011

Course Information

Meeting Times and Locations:

Lecture: Wednesday/Friday, 9:30-10:45, 1302 Siebel Center

Instructor: Dan Roth

Office: Dan Roth - 3322 Siebel Center

Office Hours (Dan Roth): Just after class, or request an appointment by email

Phone: (217) 244-7068

E-mail: danr at Illinois dot edu

Course Description

Making decisions in natural language processing problems often involves assigning values to sets of interdependent variables, where the expressive dependency structure can influence, or even dictate, which assignments are possible. Structured learning problems such as semantic role labeling provide one such example, but the setting is broader and includes a range of problems such as named entity and relation recognition and co-reference resolution. The setting is also appropriate for cases in which a solution must make use of multiple models (possibly pre-designed or pre-learned components), as in summarization, textual entailment, and question answering.

This semester, we will devote the course to the study of structured learning problems in natural language processing. We will start by recalling the "standard" learning formulations used in NLP, move to formulations of multiclass classification, and from then on focus on models of structured prediction and how they are used in NLP.

Through lectures and paper presentations this course will introduce some of the central learning frameworks and techniques that have emerged in this area over the last few years, along with their application to multiple problems in NLP and Information Extraction.

    • Models: We will present discriminative models such as the structured Perceptron and structured SVM, probabilistic models, and Constrained Conditional Models.
    • Training Paradigms: Joint Learning models; Decoupling Learning from Inference; Constraint-Driven Learning; Semi-Supervised Learning of Structure; Indirect Supervision.
    • Inference: Constrained Optimization Models, Integer Linear Programming, Approximate Inference, Dual Decomposition.
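
To make the three topics above concrete, here is a minimal sketch, in Python, of a structured Perceptron whose inference step respects a hard constraint. It is an illustration under stated assumptions, not the formulation used in the course: the B/I/O tag set, the constraint, the feature templates, the two-sentence data set, and every function name are hypothetical, and exhaustive search stands in for the Integer Linear Programming or approximate inference methods listed above.

```python
from collections import defaultdict
from itertools import product

LABELS = ("B", "I", "O")  # assumed toy tag set: Begin, Inside, Outside


def is_valid(labels):
    """Hard constraint (assumed): an I tag must directly follow a B or I tag."""
    prev = "O"
    for lab in labels:
        if lab == "I" and prev == "O":
            return False
        prev = lab
    return True


def features(tokens, labels):
    """Joint feature counts: (word, label) emissions and (prev, label) transitions."""
    feats = defaultdict(int)
    prev = "<s>"
    for tok, lab in zip(tokens, labels):
        feats[("emit", tok.lower(), lab)] += 1
        feats[("trans", prev, lab)] += 1
        prev = lab
    return feats


def score(weights, feats):
    """Linear score: dot product of the weight vector and the feature counts."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def predict(weights, tokens):
    """Constrained inference by exhaustive search; only feasible for tiny inputs."""
    candidates = (y for y in product(LABELS, repeat=len(tokens)) if is_valid(y))
    return max(candidates, key=lambda y: score(weights, features(tokens, y)))


def train(data, epochs=10):
    """Structured Perceptron: on a mistake, add gold features, subtract predicted ones."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for tokens, gold in data:
            guess = predict(weights, tokens)
            if guess != tuple(gold):
                for f, v in features(tokens, gold).items():
                    weights[f] += v
                for f, v in features(tokens, guess).items():
                    weights[f] -= v
    return weights


# Hypothetical training data, invented for this sketch.
DATA = [
    ("Dan Roth teaches".split(), ("B", "I", "O")),
    ("students read papers".split(), ("O", "O", "O")),
]

if __name__ == "__main__":
    w = train(DATA)
    for tokens, gold in DATA:
        print(tokens, predict(w, tokens), gold)
```

Because the constraint is enforced inside predict, every output is a legal tag sequence regardless of the learned weights. This is the sense in which the dependency structure can dictate which assignments are possible.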

Tentative Course Plan


Prerequisites

CS446 or equivalent is required. A course in NLP or knowledge of relevant material is recommended.

Course Materials

I will not follow a textbook. Relevant papers and notes will be available from the course home page. The following texts are listed only as background reading.

    • Daniel Jurafsky and James H. Martin, Speech and Language Processing, Prentice Hall 2008 (Second edition)
    • Christopher D. Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press 1999
    • Eugene Charniak, Statistical Language Learning, MIT Press 1993
    • Frederick Jelinek, Statistical Methods for Speech Recognition, MIT Press 1998
    • Steve Young and G. Bloothooft (Eds), Corpus-Based Methods in Language and Speech Processing, Kluwer
    • James Allen, Natural Language Understanding, Addison-Wesley


    There will be (1) scribe assignments for the lectures, (2) reading assignments along with a few short critical surveys, and (3) at least one presentation (ideally, more). There is no final exam.
    Reading and Presentations: Mandatory readings and additional recommended readings will be assigned every week.
    • Four (4) times a semester you will write a short critical essay on one of the additional readings.
    • Once or twice you will present a paper from the additional readings (30 min, focusing on the technical details of the paper).


    This is an advanced course. I view my role as guiding you through the material and helping you in your first steps as a researcher. I expect that your participation in class, reading assignments, and presentations will reflect independence, mathematical rigor, and critical thinking.