The SNoW Learning Architecture
The SNoW (Sparse Network of Winnows) learning architecture is a
multi-class classifier that is specifically tailored for large-scale
learning tasks and for domains in which the potential number of
features taking part in decisions is very large, but may be unknown a
priori.
It learns a sparse network of linear functions in which the target
concepts (class labels) are represented as linear functions over a
common feature space.
Several update rules (Winnow, Perceptron and naive Bayes) can be used
within SNoW. The SNoW learning architecture inherits its
generalization properties from the update rule being used. Thus,
when using Winnow, it is a feature efficient learning algorithm,
in that it scales linearly with the number of relevant features,
and only logarithmically with the total number of features in the domain.
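As a rough illustration of the basic update rule (not SNoW's own
code), the following Python sketch implements the positive Winnow rule
for a single Boolean target. The function names, the promotion factor
alpha = 2, and the threshold set to the number of features are
assumptions chosen for this example only.

```python
def winnow_predict(weights, theta, active):
    """Predict True when the summed weight of the active features reaches theta."""
    return sum(weights[i] for i in active) >= theta


def winnow_train(examples, n_features, alpha=2.0, epochs=10):
    """Basic (positive-weight) Winnow for one Boolean target.

    examples: list of (active, label) pairs, where `active` is the set of
    indices of the features that are on, and `label` is True or False.
    alpha and the threshold below are illustrative choices, not SNoW defaults.
    """
    weights = [1.0] * n_features      # all weights start positive
    theta = float(n_features)         # assumed threshold for this sketch
    for _ in range(epochs):
        mistakes = 0
        for active, label in examples:
            if winnow_predict(weights, theta, active) == label:
                continue              # mistake-driven: no change when correct
            mistakes += 1
            # Promote on a missed positive, demote on a false positive.
            factor = alpha if label else 1.0 / alpha
            for i in active:          # only active features are touched
                weights[i] *= factor
        if mistakes == 0:
            break                     # consistent with the training data
    return weights, theta
```

Because updates are multiplicative and touch only the active features,
the weights stay positive and the cost of one update depends on the
size of the example rather than on the size of the feature space.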
However, there are a few differences worth mentioning relative to
simply using the basic update rule, which we briefly describe here in
the context of the Winnow update rule, the most successful one in most
applications.
- SNoW allows the use of a variable input size via the ``infinite
attribute domain''.
- SNoW is more expressive than the basic Winnow rule. The basic
Winnow update rule makes use of positive weights only. Standard
augmentations, e.g., the duplication trick [Littlestone88], are
infeasible in high dimensional spaces since they diminish the gain
from using variable size examples (since half of the features become
active). More sophisticated approaches, such as using the ``balanced''
version of Winnow, apply only to the case of two classes, while SNoW is
a multi-class classifier.
- SNoW includes feature pruning methods.
- SNoW includes a prediction confidence mechanism
[Carlson, Rosen, Roth, 2001].
- SNoW allocates features and links in a data driven way.
- SNoW includes a decision support mechanism.
- SNoW is integrated with a relational feature extraction mechanism
(FEX), which provides the ability to incorporate external information
sources (features) in a flexible way.
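To make the first and the data driven allocation items above more
concrete, here is a minimal Python sketch of a sparse network with one
Winnow-style unit per class label. It is not the SNoW implementation:
the class name, the parameter values, and the policy of linking a
feature to a target the first time it is active in a positive example
of that target are assumptions made for illustration. Examples are
variable size sets of active feature names, so the feature space never
has to be declared in advance.

```python
class SparseWinnowNetwork:
    """One positive-weight linear unit per class label over a shared,
    open-ended feature space (illustrative sketch, assumed parameters)."""

    def __init__(self, alpha=2.0, beta=0.5, theta=1.0, init_weight=0.25):
        self.alpha = alpha              # promotion factor (assumed value)
        self.beta = beta                # demotion factor (assumed value)
        self.theta = theta              # activation threshold (assumed value)
        self.init_weight = init_weight  # weight of a newly allocated link
        self.weights = {}               # weights[target][feature] -> weight

    def activation(self, target, active):
        """Sum the weights of active features that are linked to target."""
        w = self.weights.get(target, {})
        return sum(w[f] for f in active if f in w)

    def predict(self, active):
        """Winner-take-all over the activations of all known targets."""
        if not self.weights:
            return None
        return max(self.weights, key=lambda t: self.activation(t, active))

    def update(self, active, label):
        """Process one example given as a set of active feature names."""
        # Data driven allocation: create the target unit and its links
        # only when they are observed in a positive example.
        links = self.weights.setdefault(label, {})
        for f in active:
            links.setdefault(f, self.init_weight)
        # Mistake-driven Winnow update, applied to each target unit.
        for target, w in self.weights.items():
            fires = self.activation(target, active) >= self.theta
            if target == label and not fires:
                factor = self.alpha     # missed positive: promote
            elif target != label and fires:
                factor = self.beta      # false positive: demote
            else:
                continue
            for f in active:
                if f in w:              # only existing links are updated
                    w[f] *= factor


# Usage: feature names are arbitrary strings invented for this example.
net = SparseWinnowNetwork()
data = [
    ({"w=bank", "w=river"}, "nature"),
    ({"w=bank", "w=money"}, "finance"),
    ({"w=river", "w=water"}, "nature"),
    ({"w=money", "w=loan"}, "finance"),
]
for _ in range(3):
    for active, label in data:
        net.update(active, label)
print(net.predict({"w=water", "w=never_seen"}))
```

In this sketch each target keeps links only to features it has seen in
its own positive examples, so unseen features cost nothing at
prediction time and the network stays sparse.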
SNoW has been used successfully on a variety of large scale learning
tasks in the natural language domain and, recently, in the visual
processing domain. It can learn and generalize from a small number of
examples and thus adapts well to new environments.
Further details about the architecture, its interpretation as a
relational system, the sparse update rules incorporated into it
(Winnow, naive Bayes, Perceptron) and their theoretical justification,
the relations to Valiant's neuroidal model, and other computational
properties are described in the following papers.
SNoW Papers:
User Guide:
- Andrew J. Carlson, Chad M. Cumby, Jeff L. Rosen and Dan Roth,
SNoW User's Guide.
UIUC Tech report UIUC-DCS-R-99-210.
General:
Natural Language:
- M. Munoz, V. Punyakanok, D. Roth, D. Zimak,
A Learning Approach to Shallow Parsing.
In Submission, 1999.
An earlier version appeared in EMNLP-WVLC'99, June 1999.
- A. R. Golding and D. Roth,
A Winnow-Based Approach to Spelling Correction.
Machine Learning, Special issue on Machine Learning and Natural
Language Processing, Volume 34, pp. 107-130, 1999.
An early version appeared in ICML'96.
- D. Roth and D. Zelenko,
Part of Speech Tagging Using a Network of Linear Separators.
COLING-ACL '98, August 1998.
- Y. Krymolowski and D. Roth,
Incorporating Knowledge in Natural Language Learning: A Case Study.
COLING-ACL '98 Workshop on the Usage of WordNet in Natural Language
Processing Systems, August 1998.
- I. Dagan, Y. Karov and D. Roth,
Mistake-Driven Learning in Text Categorization.
EMNLP '97, 2nd Conference on Empirical Methods in Natural Language
Processing, August 1997.
Visual Processing: