
Dependency trees provide a syntactic representation that encodes functional relationships between words. This representation is relatively independent of any particular grammar theory and can be used to represent the structure of sentences in different languages. Dependency structures are more efficient to parse and are believed to be easier to learn, yet they still capture much of the predicate-argument information needed in applications, which is one reason for the recent interest in learning these structures.
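To make the representation concrete, here is a minimal sketch of one common way to store a dependency tree: each word records the index of its head, with 0 standing for an artificial root. The example sentence, the label names, and the helper functions `dependency_edges` and `is_tree` are illustrative assumptions, not part of the project's code.

```python
from typing import List, Tuple

# Hypothetical example sentence; the labels are illustrative only.
WORDS = ["John", "saw", "Mary"]
# HEADS[i] is the 1-based index of the head of word i+1; 0 marks the root.
HEADS = [2, 0, 2]
LABELS = ["subj", "root", "obj"]


def dependency_edges(
    words: List[str], heads: List[int], labels: List[str]
) -> List[Tuple[str, str, str]]:
    """Return (head_word, label, dependent_word) triples; head 0 is 'ROOT'."""
    edges = []
    for i, (head, label) in enumerate(zip(heads, labels)):
        head_word = "ROOT" if head == 0 else words[head - 1]
        edges.append((head_word, label, words[i]))
    return edges


def is_tree(heads: List[int]) -> bool:
    """Check that there is exactly one root and that every word reaches it."""
    if heads.count(0) != 1:
        return False
    n = len(heads)
    for start in range(1, n + 1):
        seen = set()
        node = start
        while node != 0:
            # A repeated node means a cycle; an out-of-range head is invalid.
            if node in seen or not (1 <= node <= n):
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


if __name__ == "__main__":
    for head_word, label, dependent in dependency_edges(WORDS, HEADS, LABELS):
        print(f"{head_word} --{label}--> {dependent}")
    print("well-formed tree:", is_tree(HEADS))
```

The head-index encoding keeps the structure independent of any phrase-structure categories: only word-to-word relations are stored.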
In this project, we build a general framework for automatic dependency parsing based on a pipeline approach, where a task is decomposed into several sequential stages. To overcome the error accumulation problem of pipeline models, we propose two intuitive principles for pipeline frameworks: (1) make local decisions as reliable as possible, and (2) reduce the number of decisions that are made. We show that the proposed principles support several algorithmic choices and improve dependency-parsing accuracy.
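The intuition behind the two principles can be illustrated with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that every decision in the pipeline is correct with the same probability and that errors are independent; real parsers do not satisfy this, and the function name `all_correct_probability` is hypothetical.

```python
def all_correct_probability(local_accuracy: float, num_decisions: int) -> float:
    """Probability that every decision is correct, assuming independent errors."""
    if not 0.0 <= local_accuracy <= 1.0:
        raise ValueError("local_accuracy must be in [0, 1]")
    if num_decisions < 0:
        raise ValueError("num_decisions must be non-negative")
    return local_accuracy ** num_decisions


if __name__ == "__main__":
    # Principle (1): more reliable local decisions help.
    print(all_correct_probability(0.95, 20))
    print(all_correct_probability(0.99, 20))
    # Principle (2): fewer decisions help at a fixed local accuracy.
    print(all_correct_probability(0.95, 10))
```

Under these simplifying assumptions, the chance of an error-free analysis decays exponentially with the number of decisions, so both raising local accuracy and cutting the number of decisions increase it.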
We study this framework in the context of designing a bottom-up dependency parser. Not only do we use the framework to justify several design decisions, but we also show experimentally that following these principles improves the accuracy of the inferred trees relative to existing models. Interestingly, the trees produced by our algorithm are relatively good even for long sentences. Our algorithm performs especially well when evaluated globally, at the sentence level, where our results are significantly better than those of existing approaches, which suggests that the design goals were achieved.
The results also show that our system performs very well on multilingual dependency parsing.
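The difference between word-level and sentence-level evaluation mentioned above can be sketched as follows. This is a minimal illustration using the head-index encoding (0 for the root); the function names and the toy gold and predicted trees are assumptions, not the project's evaluation script or data.

```python
from typing import List


def attachment_accuracy(gold: List[List[int]], pred: List[List[int]]) -> float:
    """Fraction of words whose predicted head matches the gold head."""
    correct = 0
    total = 0
    for gold_heads, pred_heads in zip(gold, pred):
        if len(gold_heads) != len(pred_heads):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold_heads, pred_heads))
        total += len(gold_heads)
    return correct / total if total else 0.0


def complete_match_rate(gold: List[List[int]], pred: List[List[int]]) -> float:
    """Fraction of sentences whose whole predicted tree matches the gold tree."""
    if not gold:
        return 0.0
    matches = sum(g == p for g, p in zip(gold, pred))
    return matches / len(gold)


if __name__ == "__main__":
    # Toy data: the second sentence has one wrong head out of four.
    gold = [[2, 0, 2], [2, 0, 2, 3]]
    pred = [[2, 0, 2], [2, 0, 2, 2]]
    print("word level:", attachment_accuracy(gold, pred))
    print("sentence level:", complete_match_rate(gold, pred))
```

A single wrong head costs little at the word level but makes the entire sentence count as incorrect, which is why sentence-level scores are the stricter test of a pipeline's error accumulation.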