From: "Masanz, James J."
To: "'dev@ctakes.apache.org'"
Subject: RE: ClearNLP POSTagger
Date: Tue, 9 Apr 2013 20:30:50 +0000

That's great. Thanks.
Is there something that describes which model to use for which AE?
Or maybe put something in the model filename, or put the model in a
separate subdirectory?

-- James

> -----Original Message-----
> From: dev-return-1482-Masanz.James=mayo.edu@ctakes.apache.org
> [mailto:dev-return-1482-Masanz.James=mayo.edu@ctakes.apache.org]
> On Behalf Of Chen, Pei
> Sent: Tuesday, April 09, 2013 3:29 PM
> To: dev@ctakes.apache.org
> Subject: RE: ClearNLP POSTagger
>
> FYI:
> This has been done in trunk in r. 1466216
> https://issues.apache.org/jira/browse/CTAKES-186
> If you would like to try it out or run some benchmarks before we decide if
> we should make the default pipeline use this, just uncomment the below in
> your Aggregate Descriptors.
>
> location="../../../ctakes-pos-tagger/desc/ClearNLPPOSTagger.xml"/>
>
> ClearPOSTagger
>
> > -----Original Message-----
> > From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu]
> > Sent: Monday, April 08, 2013 5:14 PM
> > To: dev@ctakes.apache.org
> > Subject: RE: ClearNLP POSTagger
> >
> > Hi Richard,
> > Yes, the ClearNLP tools (POSTagger, Dependency Parser, SRL) in cTAKES
> > were retrained with additional data (MiPAQ/SHARP).
> > The Dependency Parser/SRL replaced the existing ones because the old
> > ClearParser ones were no longer supported.
> >
> > The ClearPOSTagger wasn't previously available in cTAKES, but we can
> > certainly make it an optional one in case some folks want to use
> > it. I'll leave the default one (OpenNLP) as-is for the time being
> > until we get some more users/tests/benchmarks/feedback...
> >
> > --Pei
> >
> > > -----Original Message-----
> > > From: Richard Eckart de Castilho
> > > [mailto:eckart@ukp.informatik.tu-darmstadt.de]
> > > Sent: Monday, April 08, 2013 1:43 PM
> > > To:
> > > Subject: Re: ClearNLP POSTagger
> > >
> > > Hi,
> > >
> > > did you train new models for the ClearNLP/OpenNLP tools? (Maybe I
> > > would know if I had followed a past discussion on models more closely.)
> > >
> > > Cheers,
> > >
> > > -- Richard
> > >
> > > On 08.04.2013 at 18:15, "Chen, Pei" wrote:
> > >
> > > > Hi,
> > > > While working on the Dependency Parser/SRL labeler, we also have a
> > > > POSTagger from ClearNLP. It is fairly simple and I have the code
> > > > ready (also trained on the same data as the dep parser, MiPAQ/SHARP)
> > > > to be checked in.
> > > > What do folks think:
> > > > We can include both Analysis Engines in the ctakes-pos-tagger
> > > > project. But should we leave the current OpenNLP in the default
> > > > pipeline or default to the latest?
> > > >
> > > > "The ClearNLP POS tagger shows more robust results on unknown words
> > > > by generalizing lexical features.
> > > > You can find the reference from this paper.
> > > > Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection,
> > > > Jinho D. Choi, Martha Palmer, Proceedings of the 50th Annual Meeting
> > > > of the Association for Computational Linguistics (ACL'12), 363-367,
> > > > Jeju, Korea, 2012. [1]
> > > > It also uses AdaGrad for machine learning, which is a more
> > > > advanced learning algorithm than maximum entropy used by OpenNLP."
> > > >
> > > > [1] http://aclweb.org/anthology-new/P/P12/P12-2071.pdf
> > >
> > >
> > > --
> > > -------------------------------------------------------------------
> > > Richard Eckart de Castilho
> > > Technical Lead
> > > Ubiquitous Knowledge Processing Lab (UKP-TUD)
> > > FB 20 Computer Science Department
> > > Technische Universität Darmstadt
> > > Hochschulstr. 10, D-64289 Darmstadt, Germany
> > > phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
> > > eckart@ukp.informatik.tu-darmstadt.de
> > > www.ukp.tu-darmstadt.de
> > > Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
> > > -------------------------------------------------------------------