ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject RE: ClearNLP POSTagger
Date Tue, 09 Apr 2013 20:28:39 GMT
FYI:
The optional ClearNLP POSTagger has been added to trunk in r1466216:
https://issues.apache.org/jira/browse/CTAKES-186
If you would like to try it out or run some benchmarks before we decide whether the
default pipeline should use it, just uncomment the following in your aggregate descriptors:

<delegateAnalysisEngine key="ClearPOSTagger">
<import location="../../../ctakes-pos-tagger/desc/ClearNLPPOSTagger.xml"/>
</delegateAnalysisEngine>
<node>ClearPOSTagger</node> 
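
For context, here is a minimal sketch of where those two pieces would sit inside a UIMA aggregate descriptor. Only the ClearPOSTagger key and import location come from the snippet above; the descriptor name and the elided neighbouring delegates are placeholders, not the actual contents of any cTAKES pipeline.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical aggregate descriptor; name and layout are illustrative only. -->
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>false</primitive>
  <delegateAnalysisEngineSpecifiers>
    <!-- ... the existing delegates of your pipeline stay here ... -->
    <delegateAnalysisEngine key="ClearPOSTagger">
      <import location="../../../ctakes-pos-tagger/desc/ClearNLPPOSTagger.xml"/>
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>
  <analysisEngineMetaData>
    <name>ExampleAggregate</name> <!-- placeholder name -->
    <flowConstraints>
      <fixedFlow>
        <!-- ... upstream nodes (e.g. sentence detection, tokenization) ... -->
        <node>ClearPOSTagger</node>
        <!-- ... downstream nodes ... -->
      </fixedFlow>
    </flowConstraints>
  </analysisEngineMetaData>
</analysisEngineDescription>
```

The value of the `<node>` element has to match the `key` of the delegate, and the import location is resolved relative to the descriptor file that contains it.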

> -----Original Message-----
> From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu]
> Sent: Monday, April 08, 2013 5:14 PM
> To: dev@ctakes.apache.org
> Subject: RE: ClearNLP POSTagger
> 
> Hi Richard,
> Yes- the ClearNLP tools (POSTagger, Dependency Parser, SRL) in cTAKES
> were retrained with additional data (MiPAQ/SHARP).
> The Dependency Parser/SRL replaced the existing one because the old
> ClearParser ones were no longer supported.
> 
> The ClearPOSTagger wasn't previously available in cTAKES, but we can
> certainly make it an optional one in case some folks may want to use it.  I'll
> leave the default one (OpenNLP) as-is for the time being until we get some
> more users/tests/benchmarks/feedback...
> 
> --Pei
> 
> > -----Original Message-----
> > From: Richard Eckart de Castilho [mailto:eckart@ukp.informatik.tu-darmstadt.de]
> > Sent: Monday, April 08, 2013 1:43 PM
> > To: <dev@ctakes.apache.org>
> > Subject: Re: ClearNLP POSTagger
> >
> > Hi,
> >
> > did you train new models for the ClearNLP/OpenNLP tools? (Maybe I would
> > know if I had followed a past discussion on models more closely.)
> >
> > Cheers,
> >
> > -- Richard
> >
> > On 08.04.2013 at 18:15, "Chen, Pei"
> > <Pei.Chen@childrens.harvard.edu> wrote:
> >
> > > Hi,
> > > While working on the Dependency Parser/SRL labeler, we also have a
> > > POSTagger from ClearNLP.  It is fairly simple and I have the code
> > > ready (also trained on the same data as the dep parser, MiPAQ/SHARP) to
> > > be checked in.
> > > What do folks think?
> > > We can include both Analysis Engines in the ctakes-pos-tagger
> > > project.  But should we leave the current OpenNLP in the default
> > > pipeline or default to the latest?
> > >
> > > "The ClearNLP POS tagger shows more robust results on unknown words
> > > by generalizing lexical features.  You can find the reference from this paper.
> > > Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection,
> > > Jinho D. Choi, Martha Palmer, Proceedings of the 50th Annual Meeting
> > > of the Association for Computational Linguistics (ACL'12), 363-367, Jeju,
> > > Korea, 2012. [1]
> > > It also uses AdaGrad for machine learning, which is a more
> > > advanced learning algorithm than maximum entropy used by OpenNLP."
> > >
> > > [1] http://aclweb.org/anthology-new/P/P12/P12-2071.pdf
> >
> >
> > --
> > -------------------------------------------------------------------
> > Richard Eckart de Castilho
> > Technical Lead
> > Ubiquitous Knowledge Processing Lab (UKP-TUD)
> > FB 20 Computer Science Department
> > Technische Universität Darmstadt
> > Hochschulstr. 10, D-64289 Darmstadt, Germany
> > phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
> > eckart@ukp.informatik.tu-darmstadt.de
> > www.ukp.tu-darmstadt.de
> > Web Research at TU Darmstadt (WeRC)
> > www.werc.tu-darmstadt.de
> > -------------------------------------------------------------------

