Machine Learning and Natural Language

Spring 2011

Course Plan and Lecture Notes

Note: Topics, lecture notes, relevant papers, and presentations will be posted and updated throughout the semester.

        I.    Introduction

  1. Introduction to the Class [PPT] [PDF] (01/19)
    1. NLP Problems; Key Approaches

  2. Models of Classification and Multiclass Classification [Notes-1] [Notes-2] (01/21, 01/26)
    1. Discriminative and Generative Models for Classification
    2. Multiclass Classification; Sequential Classification
    3. Constraint Classification for Multiclass Classification and Ranking


        II.    Basic Structured Models: Sequential Models

  3. Sequence Labeling Problems [Notes-1] [Notes-2] [Notes-3] [Notes-4] [Notes-5] [Notes-6] (02/02, 02/11, 02/16, 02/18, 02/23, 02/25)
    1. HMMs and CRFs
    2. Inference with Classifiers I
    3. Structured Perceptron
    4. Structured SVMs


        III.    Constrained Conditional Models

    1. Pipeline Models
    2. Integer Linear Programming
    3. Introducing Background Knowledge


        IV.    Training Paradigms

    1. Decoupling Learning from Inference (L+I)
    2. Inference-Based Training (Joint Learning, IBT)
    3. Online and Batch Joint Learning


        V.    Unsupervised Learning and Indirect Supervision

    1. Constraint-Driven Learning and Posterior Regularization
    2. Learning with Latent Variables
    3. Indirect Supervision


        VI.    Inference

    1. Approximate Inference
    2. Dual Decomposition


Student Lectures (exact dates may change slightly; some dates will have multiple presentations)

  1. February 16: Quang Do, "Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms"
  2. February 18: Jiansong Zhang, "Maximum Entropy Markov Models for Information Extraction and Segmentation" (Andrew McCallum, Dayne Freitag, and Fernando Pereira)
  3. February 23: Hongning Wang, "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data"
  4. February 25: Abdullah Akce, "Max-Margin Markov Networks"
  5. March 2:
  6. March 4:
  7. March 9:
  8. March 11: Tony Huang, "Max-Margin Parsing"
  9. March 16: Joe Di Febo, "Discriminative Reranking for Natural Language Parsing"
  10. March 18: Jason Cho, "Integer Linear Programming Inference for Conditional Random Fields"
  11. March 30: Juan Mancilla-Caceres, "Learning and Inference over Constrained Output"
  12. April 1:
  13. April 13: Micha Hodosh, "Learning Structural SVMs with Latent Variables"
  14. April 15:
  15. April 20: Yonatan Bisk, "Probabilistic CFG with Latent Annotations"
  16. April 22:
  17. April 27: