ctakes-dev mailing list archives

From Richard Eckart de Castilho <eck...@ukp.informatik.tu-darmstadt.de>
Subject Re: ClearNLP POSTagger
Date Mon, 08 Apr 2013 17:42:28 GMT

Did you train new models for the ClearNLP/OpenNLP tools? (Maybe I would know if I had followed
a past discussion on models more closely…)


-- Richard

On 08.04.2013 at 18:15, "Chen, Pei" <Pei.Chen@childrens.harvard.edu> wrote:

> Hi,
> While working on the Dependency Parser/SRL labeler, we also have a POSTagger from ClearNLP.
> It is fairly simple, and I have the code ready to be checked in (it is also trained on the
> same data as the dep parser: MiPaq/SHARP). What do folks think?
> We can include both Analysis Engines in the ctakes-pos-tagger project. But should we
> leave the current OpenNLP tagger in the default pipeline, or default to the latest?
> "The ClearNLP POS tagger shows more robust results on unknown words by generalizing lexical
> features. You can find the reference in this paper:
> Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection, Jinho D. Choi,
> Martha Palmer, Proceedings of the 50th Annual Meeting of the Association for Computational
> Linguistics (ACL'12), 363-367, Jeju, Korea, 2012. [1] It also uses AdaGrad for machine learning,
> which is a more advanced learning algorithm than the maximum entropy approach used by OpenNLP."
> [1] http://aclweb.org/anthology-new/P/P12/P12-2071.pdf

Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD) 
FB 20 Computer Science Department      
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
