
Semantic parsing of sentences is widely regarded as an important step on the road to natural language understanding, and has immediate applications in tasks such as information extraction and question answering. Semantic Role Labeling (SRL) is a shallow semantic parsing task in which, for each predicate in a sentence, the goal is to identify all constituents that fill a semantic role, and to determine their roles (Agent, Patient, Instrument, etc.) and their adjuncts (Locative, Temporal, Manner, etc.).
We have worked both on developing a state-of-the-art SRL tool, which builds on our Constrained Conditional Model, and on SRL from the perspective of developmental psycholinguistics.
We study the task of Semantic Role Labeling (SRL), in which, for each verb in a sentence, the goal is to identify all constituents that fill a semantic role, and to determine their roles, such as Agent, Patient, or Instrument, and their adjuncts, such as Locative, Temporal, or Manner. For example, given the sentence "I left my pearls to my daughter-in-law in my will.", the goal is to identify the different arguments of the verb "left", which yields the output:
[A0 I] [V left] [A1 my pearls] [A2 to my daughter-in-law] [AM-LOC in my will].
Here A0 represents the leaver, A1 represents the bestowed item, A2 represents the beneficiary, AM-LOC is an adjunct indicating the location of the action, and V marks the verb.
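To make the bracketed notation concrete, here is a minimal sketch that reads the labeled output above into (label, text) pairs. The function name `parse_srl` and the regular expression are illustrative assumptions, not part of our SRL tool, and the sketch assumes flat, non-nested brackets.

```python
import re

# One bracketed span such as "[A1 my pearls]": a label, whitespace, then the text.
# Assumption: spans are flat (no nested brackets).
SPAN = re.compile(r"\[(\S+)\s+([^\]]+)\]")

def parse_srl(labeled):
    """Return the (label, text) pairs of a bracketed SRL string, in sentence order."""
    return [(label, text.strip()) for label, text in SPAN.findall(labeled)]

labeled = "[A0 I] [V left] [A1 my pearls] [A2 to my daughter-in-law] [AM-LOC in my will]."
for label, text in parse_srl(labeled):
    print(f"{label:8s}{text}")
```

Each pair keeps the role label separate from the words it covers, which is the form a downstream application such as information extraction would typically consume.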
SRL is a difficult task, and neither purely manual classifiers nor purely learned classifiers can be expected to reach high levels of performance on their own. Rather, supplemental linguistic information must be used to support and correct a learning system. So far, machine learning approaches to SRL have incorporated linguistic information only implicitly, via the classifiers' features. The key innovation in our approach is a principled method for combining machine learning techniques with linguistic and structural constraints by explicitly incorporating inference into the decision process. By extending this inference to incorporate multiple SRL labeling systems, we achieved the best results in the CoNLL 2005 SRL shared task.
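The idea of adding inference on top of learned classifiers can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the scores are invented, the single constraint (no core argument label may be used twice) is just one example of a structural constraint, and exhaustive search stands in for whatever optimizer a real system would use.

```python
from itertools import product

CORE = {"A0", "A1", "A2"}  # core argument labels used in this toy example

def violates(labels):
    """Illustrative constraint: each core argument label may appear at most once."""
    core = [label for label in labels if label in CORE]
    return len(core) != len(set(core))

def constrained_decode(scores):
    """Pick the highest-scoring joint labeling that satisfies the constraint.

    scores: one dict per candidate constituent, mapping label -> classifier score.
    """
    best, best_total = None, float("-inf")
    for assignment in product(*scores):  # iterating a dict yields its labels
        if violates(assignment):
            continue
        total = sum(s[label] for s, label in zip(scores, assignment))
        if total > best_total:
            best, best_total = assignment, total
    return best

# Invented scores for the candidates "I", "my pearls", "to my daughter-in-law".
scores = [
    {"A0": 0.7, "A1": 0.2, "NONE": 0.1},
    {"A0": 0.1, "A1": 0.6, "A2": 0.2, "NONE": 0.1},
    {"A1": 0.5, "A2": 0.4, "NONE": 0.1},
]

independent = tuple(max(s, key=s.get) for s in scores)
print("independent:", independent)                 # labels A1 twice
print("constrained:", constrained_decode(scores))  # repairs the duplicate
```

Labeling each constituent independently assigns A1 twice, whereas the joint decision trades a slightly lower local score for a globally consistent labeling. This is the sense in which constraints can correct a learned system rather than only inform its features.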
Through the SRL task we are also able to study different theories of language acquisition. The structure-mapping account proposes that children start with a shallow structural analysis of sentences: children treat the number of nouns in the sentence as a cue to its semantic predicate-argument structure, and represent language experience in an abstract format that permits rapid generalization to new verbs. We test this hypothesis and others via experiments with a system for automatic semantic role labeling trained on child-directed speech (BabySRL). With BabySRL we can mimic learning experiments conducted with children, using different simple representations of the input (such as an explicitly ordered set of nouns), and check whether its behavior matches the experimental findings from children. By studying the base representations and learning progressions that children use for language acquisition, we can better create automatic systems that learn and evolve with increasingly complex language input.
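As a rough illustration of what a representation based on an ordered set of nouns might look like, the sketch below turns each noun into a feature recording its position among the nouns of the sentence. The function name, the feature format, and the example sentence are hypothetical; this is not the BabySRL feature set itself.

```python
def noun_pattern_features(tokens, noun_flags):
    """Return (noun, feature) pairs such as ("girl", "noun_1_of_2").

    tokens:     the words of the sentence
    noun_flags: parallel list of booleans, True where the word is a noun
                (assumed to be given; no tagger is run here)
    """
    noun_positions = [i for i, is_noun in enumerate(noun_flags) if is_noun]
    total = len(noun_positions)
    return [
        (tokens[i], f"noun_{rank}_of_{total}")
        for rank, i in enumerate(noun_positions, start=1)
    ]

# Made-up example sentence with its nouns marked by hand.
tokens = ["the", "girl", "kicked", "the", "ball"]
noun_flags = [False, True, False, False, True]
print(noun_pattern_features(tokens, noun_flags))
```

Because the feature depends only on the number and order of nouns, not on the verb, a learner using it could in principle generalize to sentences containing verbs it has never seen, which is the kind of behavior the structure-mapping account predicts.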