ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject ClearNLP POSTagger
Date Mon, 08 Apr 2013 16:15:10 GMT
While working on the dependency parser/SRL labeler, we also have a POS tagger from ClearNLP.
It is fairly simple, and I have the code ready to be checked in (it is trained on the same
data as the dependency parser: MiPaq/SHARP). What do folks think?
We can include both Analysis Engines in the ctakes-pos-tagger project, but should we leave
the current OpenNLP tagger in the default pipeline, or switch the default to the new ClearNLP tagger?

"The ClearNLP POS tagger shows more robust results on unknown words by generalizing lexical
features.  You can find the reference from this paper.
Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection, Jinho D. Choi, Martha
Palmer, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics
(ACL'12), 363-367, Jeju, Korea, 2012. [1] It also uses AdaGrad for machine learning, which
is a more advanced learning algorithm than maximum entropy used by OpenNLP."

[1] http://aclweb.org/anthology-new/P/P12/P12-2071.pdf
