Packages

SNoW Learning Architecture  

SNoW is a learning architecture that is tailored for learning in the presence of a very large number of information sources (features). SNoW learns a network of linear functions.
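
As a rough illustration of what a single linear function in such a network might look like, here is a minimal sketch of a classic Winnow-style learner over sparse binary features (SNoW stands for Sparse Network of Winnows). This is not SNoW's actual implementation or interface: the class name, the `train` helper, the default threshold, and the promotion and demotion factors are all assumptions chosen for the example.

```python
class Winnow:
    """Mistake-driven linear classifier over sparse binary features (illustrative only)."""

    def __init__(self, threshold=1.5, promotion=2.0, demotion=0.5):
        # All three values are hypothetical defaults, not SNoW settings.
        self.threshold = threshold
        self.promotion = promotion
        self.demotion = demotion
        self.weights = {}  # feature -> weight; unseen features default to 1.0

    def score(self, active):
        # Only the active features contribute, so cost depends on the
        # example size rather than on the total number of features.
        return sum(self.weights.get(f, 1.0) for f in active)

    def predict(self, active):
        return self.score(active) >= self.threshold

    def update(self, active, label):
        """Apply a multiplicative update on a mistake; return True if one occurred."""
        if self.predict(active) == label:
            return False
        factor = self.promotion if label else self.demotion
        for f in active:
            self.weights[f] = self.weights.get(f, 1.0) * factor
        return True


def train(model, examples, epochs=5):
    """Run several passes over (active_features, label) pairs; return the mistake count."""
    mistakes = 0
    for _ in range(epochs):
        for active, label in examples:
            mistakes += model.update(active, label)
    return mistakes
```

In this sketch, weights change only when the learner makes a mistake, and only for the features active in that example, which is what keeps the approach practical when the feature space is very large.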

 

Learning Based Java  

Learning Based Java (LBJ) is a modeling language that expedites the development of systems with one or more learning components. In an LBJ model, simple learned components are modeled conditionally, and their initial predictions are then combined via constrained optimization, yielding an expressive, globally coherent set of final predictions.

A Relational Feature Extraction Language (FEX)  

FEX is a feature extraction package used to provide input to machine learning algorithms. FEX can be used to generate features from structured text or other relational data.

 

Semantic Role Labeler (SRL)  

The Semantic Role Labeler identifies the verb-argument structure in a sentence. Specifically, it labels the sentence with PropBank-style labels. This tool is a machine-learning-based system that uses SNoW and FEX for local classification decisions, and Integer Linear Programming to make global inferences about sets of these local decisions.

LBJ Part of Speech Tagger  

This is an implementation of our SNoW-based POS tagger for use with LBJ.

 

LBJ Chunker  

A classifier that partitions plain text into sequences of semantically related words, indicating a shallow (i.e., non-hierarchical) phrase structure.

LBJ Named Entity Tagger  

This is a state-of-the-art NE tagger that tags plain text with named entities (people / organizations / locations / miscellaneous). It uses gazetteers extracted from Wikipedia, word class models derived from unlabeled text, and expressive non-local features. Its best performance is 90.8 F1 on the CoNLL03 shared task data.

 

An algorithm for unsupervised rank aggregation  

An implementation of an unsupervised learning algorithm for rank aggregation with distance-based models.
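
Distance-based ranking models rely on a distance between permutations. As a hedged illustration only, the sketch below computes the Kendall tau distance, one common choice of such a distance; the description above does not say which distance the package uses, and the function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the item pairs that the two rankings place in opposite order."""
    position_b = {item: i for i, item in enumerate(ranking_b)}
    if set(ranking_a) != set(position_b) or len(ranking_a) != len(position_b):
        raise ValueError("both rankings must order the same set of distinct items")
    # combinations() yields pairs (x, y) with x ranked before y in ranking_a;
    # the pair is discordant when ranking_b puts y before x.
    return sum(
        1 for x, y in combinations(ranking_a, 2) if position_b[x] > position_b[y]
    )
```

Identical rankings are at distance 0, and a ranking and its reverse are at the maximum distance of n(n-1)/2 for n items.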

Lifted First-Order Probabilistic Inference  

This package implements a Lifted First-Order Probabilistic Inference algorithm.

 

SNoW-based NE Tagger  

SNoW-based Named Entity Tagger. It has been replaced by the LBJ NER tagger (also available on the download page). The tagger reads plain text and annotates entities with the labels Person, Location, Organization, or Misc. Optionally, it can also assign more specific labels using a comprehensive set of lists.

A SNoW-based Shallow Parser  

Identifies the phrase structure in a sentence after being trained on labeled data.

 

A SNoW-based Part of Speech Tagger  

The POS tagger makes use of the Sequential Model, which facilitates learning and evaluating the learned function when the number of potential targets for each decision is large (in this case, there are about 50 different part-of-speech tags).

CoRanker: an algorithm for NE discovery  

An implementation of CoRanker, an algorithm for Named Entity discovery from multilingual comparable corpora.

 

stringsim library: string similarity functions  

A C++ library of string similarity functions.
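
For readers unfamiliar with the term, the sketch below shows one well-known string similarity function, a normalized Levenshtein (edit distance) similarity. It is written in Python for brevity and is not taken from the stringsim library; the function names are hypothetical, and the listing above does not say which similarity functions the library provides.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,
                )
            )
        previous = current
    return previous[-1]


def similarity(a, b):
    """Map edit distance to a score in [0, 1], where 1 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```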

TrecWN library: a WordNet interface  

A library of C++ functions that allow you to interact with WordNet.

 

cogcomp C++ library  

A collection of useful general-purpose C++ functions.

pySNoW: A Python interface to SNoW  

pySNoW is a minimal Python interface to the SNoW (Sparse Network of Winnows) learning architecture. It is meant to be faithful to the original command-line interface and provides access to the train, test, evaluate, interactive, and server modes directly from Python. pySNoW requires SNoW version 3.2.0.

 

LBJ-based Coreference Package  

A Coreference Resolver, based on LBJ, trained on the ACE corpus.

Car Detection software  

This software was used in the research described in the paper, "Learning to Detect Objects in Images via a Sparse, Part-Based Representation". It is provided 'as-is', and includes a README file to orient users. If you use this code or the data it was designed for, please cite the above work.