


This project enables a "grid" to be used in the following four research projects: (1) Advanced Programming Environments for Cluster and Grids, (2) Parallel Applications for Clusters and Grids, (3) Dynamic Sequential Code Optimization, and (4) Architectures for Multimedia and Communications Applications.

A key recent educational priority has been to use summative assessment as an incentive to, and for measurement of, learner progress and school improvement. It is widely agreed, however, that this should be supplemented by formative assessment, or assessment for screening, monitoring and diagnosis. This project aims to tackle the challenge of effective formative assessment in the area of writing. Project goals are to: 1) develop the Assess-As-You-Go Writing Assistant using an interdisciplinary team of computer scientists, measurement specialists, content area experts, and educational practitioners and 2) refine a prototype through field tests.

The goal of this research is to study an integrated theory of learning, knowledge representation and reasoning and evaluate it on large scale knowledge intensive inferences in the natural language domain. Recent studies, within the Learning to Reason fram

This research seeks to develop an integrated view - theoretical understanding, algorithm development and experimental evaluation - for learning coherent concepts. These are learning scenarios that are common in cognitive learning - where multiple learner



Recent advances in Natural Language Processing, in particular the ability to use unstructured data to answer natural language questions, are very exciting from an educational perspective. They offer the promise of systems that can automatically respond to students' questions, thus supporting not only a guided but also an open-ended, exploration-based approach to learning.
The goal of this project is to apply research in Computer Science -- particularly Natural Language Processing -- and the Learning Sciences, to developing an intelligent tutor that can provide the right kind of environment for students, one that facilitates rather than inhibits inquiry through a known knowledge space and provides a jumping-off space for trying to find or generate new knowledge.
The testbed domain in this project involves high school and undergraduate level students studying concepts in BioInformatics.

The ability to speak and understand language is probably the most intricate skill that people possess. It is certainly our most uniquely human ability. This project investigates how such an important skill is acquired and continues to develop throughout o

A fundamental task in sentence comprehension involves assigning semantic roles to sentence constituents, thus determining "who does what to whom." The syntactic bootstrapping theory proposes that even very young children use precursors of the adult's knowledge of syntax to accomplish this task. The project combines experimental and computational approaches to test and refine this theory.

The Research Experiences for Undergraduates (REU) program grant, funded by the National Science Foundation, supports active research participation by undergraduate students in computer science.

A significant amount of the software written today interacts with naturally occurring (sensor) data such as text, speech, images and video, streams of financial data, and biological sequences, and needs to reason with respect to concepts that are complex and often difficult to define explicitly in terms of the raw data observed. In the Learning Based Programming (LBP) project, we explore a novel software engineering paradigm that lets a programmer seamlessly incorporate trainable variables into a program. The programmer can then reason using high-level concepts without explicitly defining them in terms of all the variables they might depend on, or the functional dependencies among them; these may be determined in a data-driven way, via learning operators whose details are abstracted away from the programmer.
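
The idea of a trainable variable can be illustrated with a minimal Python sketch. This is not the LBP language or its API: the `LearnedPredicate` class, the `word_features` function, the perceptron learner, and the toy training messages are all assumptions made for illustration. The point is only that the calling code (`route`) uses a high-level concept without ever defining it in terms of the raw words.

```python
from collections import defaultdict


class LearnedPredicate:
    """Hypothetical helper: a boolean concept defined by training data, not by hand."""

    def __init__(self, feature_fn, epochs=10):
        self.feature_fn = feature_fn      # maps a raw input to a set of features
        self.epochs = epochs
        self.weights = defaultdict(float)

    def _score(self, x):
        return sum(self.weights[f] for f in self.feature_fn(x))

    def train(self, examples):
        # Mistake-driven perceptron updates over (input, label) pairs.
        for _ in range(self.epochs):
            for x, label in examples:
                if (self._score(x) > 0) != label:
                    delta = 1.0 if label else -1.0
                    for f in self.feature_fn(x):
                        self.weights[f] += delta

    def __call__(self, x):
        return self._score(x) > 0


def word_features(text):
    # Assumed feature extractor: the set of lowercased words.
    return set(text.lower().split())


# The concept is "defined" only through labeled examples (toy data).
is_complaint = LearnedPredicate(word_features)
is_complaint.train([
    ("the printer is broken again", True),
    ("my order arrived damaged", True),
    ("thanks for the quick reply", False),
    ("the meeting is at noon", False),
])


def route(message):
    # Ordinary program logic that reasons with the learned concept.
    return "support" if is_complaint(message) else "inbox"


print(route("the printer is broken"))   # support
print(route("thanks for the reply"))    # inbox
```

The learner could be swapped for any other without touching `route`, which is the kind of abstraction the paragraph above describes.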



Effective tactical decision-making requires a quick and accurate appreciation of the current and near-future situation. In today’s highly interconnected world, data relevant to situational awareness is plentiful. The objective of this project is to study a unified inference framework that can take as input models that correspond to disparate sources and modalities. The intention is to study both how to learn good models from different sources with different kinds of associated uncertainty, and how to combine these into a coherent decision, taking into account characteristics of the data as well as of its source.


The Bootstrapped Learning program seeks to make instructable computing a reality. The "electronic student" will learn from a human teacher who uses spoken language, gestures, demonstration, and many other methods one would find in a human-mentored relationship. Furthermore, it will build upon learned concepts and apply that knowledge across different fields of study. This research is performed under sub-contract with the Stanford Research Institute.


The goal of the Center for Multimodal Information Access and Synthesis (MIAS) at UIUC is to develop the fundamental theories, computational models, algorithms, and tools for analysts to access a variety of data formats and models, to integrate them with existing resources, and to transform raw data into useful and understandable information, in support of productive and efficient analysis. We aim to extend the state of the art and develop new technologies for: (1) Focused data retrieval and integration, to identify and collect relevant data from multiple modalities, (2) Semantic data enrichment, to allow navigation and search across disparate data modalities and augment knowledge bases by inferring semantics from unstructured data and images, (3) Entity identification and relationship discovery, to identify real-world entities and relate them to existing institutional resources, (4) Knowledge discovery and hypotheses generation and verification, to construct the rich semantic structure and hidden networks of entity linkages, and (5) Fundamental machine learning, database and data mining, natural language processing, and computer vision techniques required for and driven by the aforementioned problems.


A fundamental task in sentence comprehension is to assign semantic roles to sentence constituents. The structure-mapping account proposes that children start with a shallow structural analysis of sentences: children treat the number of nouns in the sentence as a cue to its semantic predicate-argument structure, and represent language experience in an abstract format that permits rapid generalization to new verbs. In this project, we test the consequences of these representational assumptions via experiments with a system for automatic semantic role labeling (SRL), trained on a sample of child-directed speech.


The ECHO Depository is a digital preservation research and development project funded by the Library of Congress under its National Digital Information Infrastructure and Preservation Program. The ECHO Depository project pulls together several streams of activities aimed at helping to answer the question of how digital resources will be identified, archived, and preserved for the future.



The project studies a machine learning centered approach to data-intensive and computing-intensive processing for intelligent context-sensitive human-machine interfaces. The future of intelligent human-machine interaction is in the ability to perform co



This project is building a capability for identifying text fragments that exhibit a set of specifiable semantic properties in large text corpora. The objective is to investigate and develop advanced learning and reasoning technologies in support of natural language understanding-related tasks. We will develop an approach to focused textual entailment in the context of text anonymizing. The research is investigating both a novel inference method for focused textual entailment, and methods for acquiring appropriate declarative knowledge required to support this inference. We will apply and evaluate it in the context of anonymizing text snippets with respect to specific goals.


Dash Optimization has generously provided the XPress-MP Optimization Suite, which has served as an advanced optimization tool for the research in our group.



Communication through written language is by far the most commonly used communication channel. The ubiquity of computing devices has made authoring very easy (email, blogs and wikis are only some examples), and documents are being produced at a higher rate than ever. Most of the authoring, by a large margin, is in English and by non-native speakers. Nonetheless, the only tool available to authors today is a text processor: in effect, a typewriter with memory, equipped with the 30-year-old technology of automatic spell checking against a dictionary. Current authoring platforms offer minimal guidance with regard to the "correctness" of a document: context-sensitive mistakes, word selection and usage, sentence structure and readability, use of connectives, and so forth. We suggest that current natural language processing technologies allow us to develop a tool that can actually help writers, including native speakers of English, non-native speakers, and writers with disabilities (e.g., dyslexia), produce better, professional-looking English documents, email messages, and reports. We are developing an authoring assistance tool that is capable of identifying and correcting grammatical mistakes and context-sensitive word usage mistakes (it’s/its, in/it, number/amount); guiding writers to select appropriate prepositions and determiners; enhancing written language by proposing appropriate adjectives and adverbs; and providing guidance for structuring a document.
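
To make the notion of a context-sensitive word usage mistake concrete, here is a minimal Python sketch of one way such a check could work. It is not the project's method: the neighbor-word counting scheme, the function names (`context`, `train`, `suggest`), and the four toy training sentences are assumptions chosen for illustration. The sketch learns which neighboring words tend to surround each member of a confusion set, then replaces a word only when another member of the set fits the context strictly better.

```python
from collections import Counter

# One confusion set, taken from the examples in the text above.
CONFUSION_SET = ("its", "it's")


def context(tokens, i):
    # Assumed features: the words immediately before and after position i.
    prev_word = tokens[i - 1] if i > 0 else "<s>"
    next_word = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return [("prev", prev_word), ("next", next_word)]


def train(sentences):
    # Count, for each confusable word, the contexts it appears in.
    counts = {w: Counter() for w in CONFUSION_SET}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok in counts:
                counts[tok].update(context(tokens, i))
    return counts


def suggest(counts, sentence):
    tokens = sentence.lower().split()
    for i, tok in enumerate(tokens):
        if tok in counts:
            feats = context(tokens, i)
            scores = {w: sum(counts[w][f] for f in feats) for w in CONFUSION_SET}
            best = max(scores, key=scores.get)
            # Keep the writer's word unless an alternative scores strictly higher.
            if scores[best] > scores[tok]:
                tokens[i] = best
    return " ".join(tokens)


# Toy training data (hypothetical; a real system would need a large corpus).
counts = train([
    "the dog wagged its tail",
    "the company raised its prices",
    "it's a nice day",
    "i think it's going to rain",
])

print(suggest(counts, "the cat licked it's tail"))  # the cat licked its tail
print(suggest(counts, "its going to rain"))         # it's going to rain
```

A dictionary-based spell checker would accept both input sentences, since every word in them is spelled correctly; only the surrounding context reveals the error.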


Extending from our other research activities in textual entailment, the objective of this Google-sponsored project is to investigate and develop advanced learning and reasoning technologies in support of natural language understanding-related tasks. This project supports the Textual Entailment research that is primarily sponsored by Boeing Aircraft.


For some of our projects, as noted in the project descriptions, we have collaborative funding arrangements with other Departments of the University of Illinois.