
[ Overview | Publications ]
The problem of domain adaption is critical in natural language processing, given that models trained on a given domain (e.g., news) may not perform well on other domains (e.g., medical text) and that one may want to adapt learned models also across languages and across related tasks. We attempt both to (1) develop new algorithms for adaption across domains and languages and (2) develop a theoretical understanding of the domain adaptation issues and algorithms.
This is an active area of research over the last few years; research has focused on the setting where data from multiple source domains or tasks is available and can be used to improve learning performance on a new and related target task.
We have done earlier work on domain adaptation in the context of Context Sensitive Text Correction and, more recently, in the context of Transliteration. In the latter case, we assume that we already know how to recognize entities in English articles, but given very little supervision we also might like to discover entities in Russian articles. Such an approach can also be treated as an adaption algorithm. We are currently working on developing algorithms and theoretical understanding for the setting of adaptation and multitask learning.