
New: A jar file has been released, so compiling is no longer required.
An online demo of the system is available here.
This is a state-of-the-art NER tagger that tags plain text with named entities (people / organizations / locations / miscellaneous). It uses gazetteers extracted from Wikipedia, a word class model derived from unlabeled text, and expressive non-local features. The best performance is 90.8 F1 on the CoNLL03 shared task data. The tagger is robust and has been evaluated on a variety of datasets. For detailed results and for design and modeling details, please read the paper cited below.
If you use this system, please cite:
L. Ratinov and D. Roth
Design Challenges and Misconceptions in Named Entity Recognition
CoNLL 2009
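
For readers unfamiliar with the F1 figure quoted above: CoNLL03-style scores are computed at the entity level, where a predicted entity counts as correct only if both its span and its type match the gold annotation exactly. The following is a minimal sketch of that calculation, not the tagger's own evaluation code. The function name entity_f1, the tuple layout, and the sample data are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: (start_token, end_token_exclusive, label).
Entity = Tuple[int, int, str]


def entity_f1(gold: Set[Entity], predicted: Set[Entity]) -> float:
    """Entity-level F1: a prediction is correct only on an exact span and label match."""
    correct = len(gold & predicted)
    if correct == 0:
        # Covers empty inputs and avoids division by zero.
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up annotations: the second prediction has the right span but the wrong type.
gold = {(0, 2, "PER"), (5, 6, "ORG"), (9, 10, "LOC")}
predicted = {(0, 2, "PER"), (5, 6, "LOC"), (9, 10, "LOC")}
score = entity_f1(gold, predicted)
```

In this made-up example two of the three predictions match exactly, so precision and recall are both 2/3 and the F1 is also 2/3. A reported score such as 90.8 is the same quantity expressed as a percentage.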