mssammon|csil-mm24|/scratch/tutorial/ne|[1]% cp ../ShallowParser/samples/sample0.sp .
mssammon|csil-mm24|/scratch/tutorial/ne|[2]% ls
analysis/ fex@ ne-newdata/ ne.scr old_data/ run_ne_newdata.script snow@
conll03.train.upd.lex ne.net ne-postprocess/ old/ orig_results/ sample0.sp
mssammon|csil-mm24|/scratch/tutorial/ne|[3]% head sample0.sp
[NP (NNP India) ] (, ,) [NP (NNP Nepal) ] [PP (TO to) ] (NNP Control) (NNP Terrorist) (NNPS Activities)
[PP (IN At) ] [NP (DT the) (NN end) ] [PP (IN of) ] [NP (DT a) (JJ two-day) (NN meeting) ] [PP (IN of) ] [NP (DT the) (JJ fourth) (NNP India-Nepal) (NNP Joint) (NNP Working) (NNP Group) ] [PP (IN on) ] [NP (NNP Border) (NN management) ] [ADVP (RB here) ] (, ,) [NP (DT the) (CD two) (NNS neighbors) ] [ADVP (RB also) ] [VP (VBD agreed) (TO to) (VB share) ] [NP (NN intelligence) ] (IN so) (IN as) [VP (TO to) (RB effectively) (VB deal) ] [PP (IN with) ] [NP (NNS terrorists) ] (CC and) [NP (JJ undesirable) (NNS elements) ] [PP (IN across) ] [NP (DT the) (NN border) ] (, ,) [PP (VBG according) ] [PP (TO to) ] [NP (DT a) (NN press) (NN release) ] [VP (VBD issued) ] [NP (NNP Friday) ] (. .) [NP (DT The) (NN decision) ] [VP (VBD assumed) ] [NP (NN significance) ] [PP (IN in) ] [NP (DT the) (NN context) ] [PP (IN of) ] [NP (NN apprehension) ] [PP (IN in) ] [NP (NNP New) (NNP Delhi) ] (IN about) [NP (DT the) (NNS activities) ] [PP (IN of) ] [NP (NN anti-India) (NNS elements) ] [PP (IN in) ] [NP (NNP Nepal) ] [NP (WDT which) ] [VP (VBD were) (VBN suspected) ] [PP (IN behind) ] [NP (DT the) (NN hijacking) ] [PP (IN of) ] [NP (DT an) (JJ Indian) (NN flight) ] [PP (IN from) ] [NP (NNP Kathmandu) ] (IN towards) [NP (DT the) (NN end) ] [PP (IN of) ] [NP (CD 1999) ] (CC and) [NP (DT the) (NN antiIndia) (NNS riots) ] [NP (RB early) (DT this) (NN year) ] (. .)
[NP (DT The) (CD two) (NNS sides) ] [VP (VBD decided) (TO to) (VB hold) ] [NP (JJ regular) (NNS meetings) ] [PP (IN of) ] [NP (DT the) (NNP Interpol) (NNS units) ] [VP (TO to) (VB expedite) ] [NP (NN disposal) ] [PP (IN of) ] [NP (VBG pending) (NNS cases) ] [PP (IN on) ] [NP (DT both) (NNS sides) ] (. .) [NP (PRP They) ] [ADVP (RB also) ] [VP (VBD agreed) (TO to) (VB commence) ] [NP (JJ expert-level) (NNS discussions) ] [PP (IN on) ] [NP (DT a) (JJ legal) (NN framework) ] [PP (IN for) ] [NP (NN cooperation) ] [PP (IN in) ] [NP (JJ criminal) (CC and) (JJ civil) (NNS matters) ] (CC and) [VP (TO to) (VB review) ] [NP (NN extradition) (NNS arrangements) ] (, ,) [VP (VBD said) ] [NP (DT the) (NN press) (NN release) ] (. .) [NP (DT The) (CD two) (NNS countries) ] [VP (MD would) (VB expedite) ] [NP (DT the) (JJ procedural) (NNS aspects) ] [PP (IN of) ] [VP (VBG improving) ] [NP (JJ infrastructural) (NNS facilities) ] [PP (IN at) ] [NP (DT the) (NN border) ] [VP (VBZ checkposts) ] (. .) [SBAR (IN While) ] [NP (DT the) (JJ Indian) (NN side) ] [VP (VBD was) (VBN led) ] [PP (IN by) ] [NP (JJ joint) (NN secretary) ] [PP (IN in) ] [NP (DT the) (NNP Home) (NNP Ministry) (NNP Surendra) (NNP Kumar) ] (, ,) [NP (DT the) (NNPS Nepalese) (NN delegation) ] [VP (VBD was) (VBN headed) ] [PP (IN by) ] [NP (NNP Tika) (NNP Dutta) (NNP Niraula) ] (, ,) [NP (JJ joint) (NN secretary) ] [PP (IN in) ] (DT the) (NNPS Nepalese) (NNP Home) (NNP Ministry) [NP (PRP It) ] [VP (VBD was) (VBN agreed) ] [SBAR (IN that) ] [NP (DT the) (JJ fifth) (NN meeting) ] [PP (IN of) ] [NP (DT the) (JJ joint) (NN working) (NN group) ] [VP (MD would) (VB be) (VBN held) ] [PP (IN in) ] [NP (NNP Kathmandu) ] [PP (IN at) ] [NP (RB mutually) (JJ convenient) (NNS dates) ] (. .) 
[NP (DT The) (JJ last) (NN meeting) ] [PP (IN of) ] [NP (DT the) (JJ joint) (NN group) ] [PP (IN on) ] [NP (NN border) (NN management) ] [VP (VBD was) (VBN held) ] [PP (IN in) ] [NP (NNP Kathmandu) ] [PP (IN in) ] [NP (DT the) (JJ first) (NN week) ] [PP (IN of) ] [NP (NNP February) ] [NP (JJ last) (NN year) ] (. .)
[NP (NNP Singapore) ] [VP (VBZ Announces) ] [NP (JJ Second) (NN Phase) ] [PP (IN of) ] (NNP Banking) (NNP Liberalization) (NN Program)
mssammon|csil-mm24|/scratch/tutorial/ne|[5]% mv sample0.sp ne-newdata
mssammon|csil-mm24|/scratch/tutorial/ne|[5]% cd ne-newdata
mssammon|csil-mm24|/scratch/tutorial/ne/ne-newdata|[6]% ls
chunk-to-column.pl* sample0.sp
mssammon|csil-mm24|/scratch/tutorial/ne/ne-newdata|[7]% ./chunk-to-column.pl
Usage: ./chunk-to-column.pl at ./chunk-to-column.pl line 18.
mssammon|csil-mm24|/scratch/tutorial/ne/ne-newdata|[8]% ./chunk-to-column.pl sample0.sp > sample0.col
mssammon|csil-mm24|/scratch/tutorial/ne/ne-newdata|[9]% head sample0.col
B-NP 0 0 B-NP NNP India x 0 0
O 0 1 O , , x 0 0
B-NP 0 2 B-NP NNP Nepal x 0 0
O 0 3 B-PP TO to x 0 0
O 0 4 O NNP Control x 0 0
O 0 5 O NNP Terrorist x 0 0
O 0 6 O NNPS Activities x 0 0

O 0 0 B-PP IN At x 0 0
B-NP 0 1 B-NP DT the x 0 0
mssammon|csil-mm24|/scratch/tutorial/ne/ne-newdata|[10]% cp sample0.col ..
mssammon|csil-mm24|/scratch/tutorial/ne/ne-newdata|[11]% cd ..
mssammon|csil-mm24|/scratch/tutorial/ne|[12]% ls
analysis/ fex@ ne-newdata/ ne.scr old_data/ run_ne_newdata.script snow@
conll03.train.upd.lex ne.net ne-postprocess/ old/ orig_results/ sample0.col
mssammon|csil-mm24|/scratch/tutorial/ne|[14]% ./fex -P 0 -r ne.scr conll03.train.upd.lex sample0.col sample0.snow.test
Fex - Feature Extractor
Cognitive Computations Group - University of Illinois at Urbana/Champaign
Version 2.3.1
Processing...
mssammon|csil-mm24|/scratch/tutorial/ne|[15]% head sample0.snow.test
1001, 1016, 1176, 1185, 1188, 1322, 1441, 1577, 1585, 1588, 35553, 35554, 35558, 94005, 94008, 140130, 298486:
1001, 1016, 1026, 1033, 1040, 1041, 1114, 1441, 1884, 1895, 1896, 5885, 35560, 35563, 58921, 58922, 58924, 174801:
1018, 1035, 1036, 1046, 1120, 1142, 1221, 1321, 1341, 1343, 1352, 1633, 1650, 1668, 1939, 2474, 2485, 3763, 3764, 3767, 3768, 3769, 3773, 3774, 6336, 6342, 46906, 46909:
1020, 1035, 1045, 1046, 1094, 1120, 1142, 1204, 1206, 1221, 1317, 1321, 1322, 1325, 1372, 1381, 1384, 1466, 1679, 1680, 1689, 2342, 2474, 2480, 2485, 2491, 3287, 3294, 3786, 3787, 3796, 3797, 6353, 32429, 32438, 33732, 46911, 46913, 46914, 46919, 46920, 46923, 259583:
1010, 1016, 1045, 1074, 1120, 1142, 1205, 1206, 1221, 1341, 1343, 1352, 1441, 1466, 1468, 1527, 1550, 1927, 2366, 2369, 3287, 3294, 3517, 3523, 7992, 9833, 9838, 15154, 15160, 15163, 19255, 19276, 19280, 46928, 46933, 58879, 58882, 65527, 65536, 83842, 83864, 83869, 112012, 126617:
1016, 1018, 1041, 1046, 1130, 1142, 1143, 1144, 1148, 1580, 1587, 1591, 2602, 3063, 3671, 3683, 3685, 3770, 7884, 7891, 10270, 10277, 65552, 65553, 65561, 104459, 104460, 277169:
1064, 1091, 1094, 1120, 1135, 1144, 1176, 1185, 1186, 1188, 1212, 1341, 1343, 1352, 1533, 1884, 1895, 1896, 2602, 3308, 3343, 3909, 3915, 4781, 4782, 4783, 4784, 4791, 4793, 4799, 7898, 7901, 14820, 43441, 43444, 65568, 65573:
1001, 1046, 1056, 1065, 1066, 1070, 1113, 1166, 1176, 1185, 1188, 1221, 1527, 2255, 2264, 2681, 4704, 4712, 13920, 13925, 49695, 49701, 83363, 83368, 88790, 88792, 177352:
1001, 1009, 1142, 1181, 1194, 1212, 1220, 1223, 1229, 1232, 1883, 1890, 1891, 1894, 2360, 2891, 2896, 2974, 15758, 15765, 23005, 51666, 51746, 51747, 51749, 212714, 212716:
1010, 1018, 1045, 1212, 1221, 1248, 1317, 1321, 1325, 1485, 1592, 1902, 1908, 1919, 1922, 2833, 2837, 2838, 2902, 2907, 4039, 4044, 10275, 15779, 15783, 23014, 107098, 118753, 148550, 148555, 237466:
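Each line of sample0.snow.test appears to be one example: a comma-separated list of active feature ids ending in a colon. The ids can be read back against the lexicon file passed to fex (conll03.train.upd.lex), whose lines pair an id with a feature string such as `1001 phLen[1]`. The following is a minimal Python sketch of that lookup, not part of the tutorial's tools; the function names `parse_example`, `load_lexicon`, and `describe` are hypothetical, and the file layout is assumed from the output shown in this session.

```python
def parse_example(line):
    """Return the feature ids of one example line such as '1001, 1016, 1176:'."""
    body = line.strip().rstrip(":")
    return [int(tok) for tok in body.split(",") if tok.strip()]


def load_lexicon(lines):
    """Map each numeric id to its feature string (assumed layout: '<id> <feature>')."""
    lexicon = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            lexicon[int(parts[0])] = parts[1].strip()
    return lexicon


def describe(feature_ids, lexicon):
    """Replace each id by its feature string, marking ids missing from the lexicon."""
    return [lexicon.get(i, "<unknown %d>" % i) for i in feature_ids]


if __name__ == "__main__":
    # Two lexicon lines copied from the session; a real run would read the whole file.
    lexicon = load_lexicon(["1 label[ORG]", "1001 phLen[1]"])
    ids = parse_example("1001, 1016, 1176:")
    print(describe(ids, lexicon))
```

With only these two lexicon lines loaded, id 1001 resolves to `phLen[1]` and the other ids are reported as unknown; loading the full lexicon would resolve them as well.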
mssammon|csil-mm24|/scratch/tutorial/ne|[16]% head conll03.train.upd.lex
1 label[ORG]
1001 phLen[1]
1002 w[*_German]&t[*_JJ]
1003 w[*___to]&t[*___TO]
1004 w[*__call]&t[*__NN]
1005 w[*rejects]&t[*VBZ]
1006 w[*EU]&t[*NNP]
1007 w[EU]&t[NNP]
1008 t[*VBZ]
1009 t[*_JJ]
mssammon|csil-mm24|/scratch/tutorial/ne|[17]% ls
analysis/ fex@ ne-newdata/ ne.scr old_data/ run_ne_newdata.script sample0.snow.test
conll03.train.upd.lex ne.net ne-postprocess/ old/ orig_results/ sample0.col snow@
mssammon|csil-mm24|/scratch/tutorial/ne|[18]% ./snow -test -I sample0.snow.test -F ne.net -o winners -R sample0.res
SNoW+ - Sparse Network of Winnows Plus
Cognitive Computations Group - University of Illinois at Urbana-Champaign
Version 3.2.0
Input file: 'sample0.snow.test'
Network file: 'ne.net'
Directing output to file 'sample0.res'
mssammon|csil-mm24|/scratch/tutorial/ne|[19]% head sample0.res
Algorithm information:
Winnow: (1.35, 0.8, 4, 0.3435) Targets: 1-5
4
4
5
5
5
5
5
5
mssammon|csil-mm24|/scratch/tutorial/ne|[23]% cp sample0.res ne-postprocess
mssammon|csil-mm24|/scratch/tutorial/ne|[24]% cd ne-postprocess
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[25]% ls
applyLabels.pl* conll03.train.upd.lex numbersToLabels.pl* sample0.col sample0.res
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[26]% ./numbersToLabels.pl
Usage: ./numbersToLabels.pl snow_winners lexicon [ > output] at ./numbersToLabels.pl line 13.
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[27]% ./numbersToLabels.pl sample0.res conll03.train.upd.lex > sample0.labels
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[28]% head sample0.labels
LOC
LOC
OTHER
OTHER
OTHER
OTHER
OTHER
OTHER
OTHER
OTHER
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[29]% ./applyLabels.pl
Usage: ./applyLabels.pl input.col labelfile [> output.col] at ./applyLabels.pl line 14.
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[31]% head sample0.col
B-NP 0 0 B-NP NNP India x 0 0
O 0 1 O , , x 0 0
B-NP 0 2 B-NP NNP Nepal x 0 0
O 0 3 B-PP TO to x 0 0
O 0 4 O NNP Control x 0 0
O 0 5 O NNP Terrorist x 0 0
O 0 6 O NNPS Activities x 0 0

O 0 0 B-PP IN At x 0 0
B-NP 0 1 B-NP DT the x 0 0
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[32]% ./applyLabels.pl sample0.col sample0.labels > sample0.final
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[33]% head sample0.final
B-LOC 0 0 B-NP NNP India x 0 0
O 0 1 O , , x 0 0
B-LOC 0 2 B-NP NNP Nepal x 0 0
O 0 3 B-PP TO to x 0 0
O 0 4 O NNP Control x 0 0
O 0 5 O NNP Terrorist x 0 0
O 0 6 O NNPS Activities x 0 0

O 0 0 B-PP IN At x 0 0
B-OTHER 0 1 B-NP DT the x 0 0
mssammon|csil-mm24|/scratch/tutorial/ne/ne-postprocess|[34]%
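The two post-processing steps in this session can be summarised as: turn each SNoW winner id into a label name via the lexicon, then write those labels over the first column of the rows that start a phrase. The Python sketch below illustrates that logic only; it is not the source of numbersToLabels.pl or applyLabels.pl, which is not shown here. The function names are hypothetical, the lexicon entries for ids 4 and 5 are assumed from the observed 4 -> LOC and 5 -> OTHER mapping, and leaving rows that do not start with `B-` untouched is also an assumption.

```python
import re


def read_label_map(lexicon_lines):
    """Collect target-id -> label from lexicon lines such as '1 label[ORG]'."""
    labels = {}
    for line in lexicon_lines:
        match = re.match(r"^(\d+)\s+label\[(.+)\]\s*$", line)
        if match:
            labels[match.group(1)] = match.group(2)
    return labels


def winners_to_labels(result_lines, labels):
    """Translate winner ids to label names, skipping the SNoW header lines."""
    return [labels[line.strip()] for line in result_lines if line.strip() in labels]


def apply_labels(col_lines, predicted):
    """Overwrite the first column of each 'B-...' row with 'B-<predicted label>'."""
    remaining = iter(predicted)
    out = []
    for line in col_lines:
        if line.startswith("B-"):
            label = next(remaining)  # raises StopIteration if labels run out
            line = re.sub(r"^B-\S+", lambda _m: "B-" + label, line, count=1)
        out.append(line)
    return out


if __name__ == "__main__":
    # Entries for ids 4 and 5 are assumed; only '1 label[ORG]' appears in the session.
    lexicon = ["1 label[ORG]", "4 label[LOC]", "5 label[OTHER]", "1001 phLen[1]"]
    results = ["Algorithm information:",
               "Winnow: (1.35, 0.8, 4, 0.3435) Targets: 1-5", "4", "4", "5"]
    columns = ["B-NP 0 0 B-NP NNP India x 0 0",
               "O 0 1 O , , x 0 0",
               "B-NP 0 2 B-NP NNP Nepal x 0 0",
               "",
               "O 0 0 B-PP IN At x 0 0",
               "B-NP 0 1 B-NP DT the x 0 0"]
    labels = winners_to_labels(results, read_label_map(lexicon))
    for row in apply_labels(columns, labels):
        print(row)
```

Consuming one prediction per `B-` row matches what the session shows: SNoW emits one winner per phrase, so the comma and the other `O` rows keep their original first column.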