From: "Chen, Pei"
To: "dev@ctakes.apache.org"
Subject: RE: Preparing for an Apache cTAKES 3.2 Release?
Date: Wed, 18 Jun 2014 17:49:19 +0000
Message-ID: <924DE05C19409B438EB81DE683A942D910834147@CHEXMBX1A.CHBOSTON.ORG>
In-Reply-To: <924DE05C19409B438EB81DE683A942D91081D7F1@CHEXMBX1A.CHBOSTON.ORG>
Reply-To: dev@ctakes.apache.org

Renamed to *-fast.

Again, this is only temporary... this will eventually just replace the existing dictionary lookup (next minor release?).

> -----Original Message-----
> From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu]
> Sent: Tuesday, June 17, 2014 10:14 AM
> To: dev@ctakes.apache.org
> Subject: RE: Preparing for an Apache cTAKES 3.2 Release?
>
> Yes. It's only temporary, to give folks a chance to try out and transition to the
> new lookup algorithm (hence, the +1 for the -fast suffix rename).
> But open to biting the bullet and defaulting it now if folks are compelled to
> do so.
>
> > -----Original Message-----
> > From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> > Sent: Monday, June 16, 2014 11:36 AM
> > To: dev@ctakes.apache.org
> > Subject: RE: Preparing for an Apache cTAKES 3.2 Release?
> >
> > I guess that I've got one question at this point:
> >
> > Is the name being given to the -new- dictionary lookup module
> > temporary or permanent?
> >
> > I was under the assumption that it was temporary and that with the
> > switch to it being default (and eventually only) the module would
> > simply be named "dictionary-lookup".
> >
> > -----Original Message-----
> > From: Masanz, James J. [mailto:Masanz.James@mayo.edu]
> > Sent: Monday, June 16, 2014 11:24 AM
> > To: 'dev@ctakes.apache.org'
> > Subject: RE: Preparing for an Apache cTAKES 3.2 Release?
> >
> > I'd rather something else than "dictionary-lookup-fast". If we come up
> > with something even faster than this one, having an older one called
> > "fast" could be confusing.
> >
> > -----Original Message-----
> > From: Dligach, Dmitriy [mailto:Dmitriy.Dligach@childrens.harvard.edu]
> > Sent: Monday, June 16, 2014 9:55 AM
> > To: cTAKES Developer list
> > Subject: Re: Preparing for an Apache cTAKES 3.2 Release?
> >
> > +1
> >
> > Dima
> >
> > On Jun 16, 2014, at 9:42, Miller, Timothy wrote:
> >
> > > Sorry to weigh in so late on this -- just returned from vacation. If
> > > we want to have a one release delay before making dictionary2
> > > default for testing/documentation/configuration purposes, and there
> > > isn't an obvious function-related name, and the main difference is
> > > speed, maybe we could call it dictionary-lookup-fast? Besides being
> > > accurate and more descriptive than "2", it might lure people into
> > > trying it and give us some feedback.
> > >
> > > Tim
> > >
> > > On 06/16/2014 10:34 AM, Chen, Pei wrote:
> > >> I'm making some significant updates to trunk that may cause some
> > >> instability for this release.
> > >> It should be mostly transparent, but let me know if you encounter
> > >> any issues with trunk.
> > >>
> > >> Also, regarding the dictionary-lookup2. If there are no strong
> > >> objections, we can leave default to as-is (old behavior). Folks who
> > >> wish to give the new one a try are welcome to do so and we can
> > >> change the default behavior in a future release.
> > >>
> > >> [ducks for cover now]
> > >> --Pei
> > >>
> > >>> -----Original Message-----
> > >>> From: ksarma@gmail.com [mailto:ksarma@gmail.com] On Behalf Of
> > >>> Karthik Sarma
> > >>> Sent: Wednesday, June 11, 2014 9:58 AM
> > >>> To: dev@ctakes.apache.org
> > >>> Subject: Re: Preparing for an Apache cTAKES 3.2 Release?
> > >>>
> > >>> Agreed
> > >>>
> > >>> On Wednesday, June 11, 2014, vijay garla wrote:
> > >>>
> > >>>> regardless of the name, I think it would be incredibly helpful to
> > >>>> have thorough documentation on the dictionary lookup, how to
> > >>>> configure it, and how to create new dictionaries. I would
> > >>>> venture to say that this is the most important component in
> > >>>> cTAKES, and probably the one that has generated the most
> > >>>> questions on the newsgroup.
> > >>>>
> > >>>> On Wed, Jun 11, 2014 at 9:21 AM, Finan, Sean
> > >>>> <Sean.Finan@childrens.harvard.edu> wrote:
> > >>>>
> > >>>>>> . The newer NER should have in its name the Behavior...
> > >>>>> I agree, but the *2 module is a complete replacement for the
> > >>>>> current lookup. It does not (really) have any different
> > >>>>> behavior, just a different implementation and performance.
> > >>>>> We plan to swap out the old with the new in the next release
> > >>>>> and get rid of the *2 suffix. So, any name provided now is just
> > >>>>> temporary - unless people don't like the name
> > >>>>> "dictionary-lookup" at all.
> > >>>>>
> > >>>>> In my original sandbox it was named "RareWordLookup", a nod to
> > >>>>> its implementation. However, this doesn't help any users.
> > >>>>>
> > >>>>> Sean
> > >>>>>
> > >>>>> -----Original Message-----
> > >>>>> From: andy mcmurry [mailto:mcmurry.andy@gmail.com]
> > >>>>> Sent: Wednesday, June 11, 2014 3:09 AM
> > >>>>> To: dev@ctakes.apache.org
> > >>>>> Subject: Re: Preparing for an Apache cTAKES 3.2 Release?
> > >>>>>
> > >>>>> "2" doesn't mean much. The newer NER should have in its name the
> > >>>>> Behavior...
> > >>>>>
> > >>>>> Perhaps something like MetaMap Usage "--allow_overmatches"
> > >>>>> or "--allow_concept_gaps" or .....other?
> > >>>>>
> > >>>>> Since yTex already provides a pluggable *DictionaryLookup*, that
> > >>>>> seems like the best place to define the differing Behavior / Usage.
> > >>>>>
> > >>>>> https://cwiki.apache.org/confluence/display/CTAKES/User's+Guide
> > >>>>> https://code.google.com/p/ytex/wiki/DictionaryLookup_V05
> > >>>>>
> > >>>>> AndyMC
> > >>>>>
> > >>>>> On Tue, Jun 10, 2014 at 9:55 AM, britt fitch wrote:
> > >>>>>
> > >>>>>> I don't have an issue with the *-2 name. I also don't have any
> > >>>>>> objections to renaming it.
> > >>>>>>
> > >>>>>> It might be nice to keep the old dictionary code around for a
> > >>>>>> release-worth of time but after that I would vote purging it.
> > >>>>>> If someone needs it after that it'll be accessible in the
> > >>>>>> archived releases.
> > >>>>>>
> > >>>>>> On Jun 10, 2014, at 12:48 PM, Chen, Pei wrote:
> > >>>>>>
> > >>>>>>> I think James has a fair point here.
> > >>>>>>> It may be worthwhile biting the bullet here and pushing forward.
> > >>>>>>>
> > >>>>>>> Since this essentially will be a full replacement of the
> > >>>>>>> ctakes-dictionary-lookup module, a good option may be to just
> > >>>>>>> replace the entire module now and rename the existing module to
> > >>>>>>> *_deprecated.
> > >>>>>>> How do folks feel about that? In a nutshell,
> > >>>>>>> ctakes-dictionary-lookup-2 is a faster algorithm with a simpler
> > >>>>>>> code base - and comparable results (Sean has a full comparison
> > >>>>>>> in the documentation for those who are curious).
> > >>>>>>> --Pei
> > >>>>>>>
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: britt fitch [mailto:britt.fitch@gmail.com]
> > >>>>>>>> Sent: Monday, June 09, 2014 5:42 PM
> > >>>>>>>> To: dev@ctakes.apache.org
> > >>>>>>>> Subject: Re: Preparing for an Apache cTAKES 3.2 Release?
> > >>>>>>>>
> > >>>>>>>> There is some documentation in the dictionary2 module under
> > >>>>>>>> /doc/DictionaryLookupHelp.{txt | docx} that gives some
> > >>>>>>>> details of the different lookup implementation options within
> > >>>>>>>> that module that I found helpful.
> > >>>>>>>>
> > >>>>>>>> On Jun 9, 2014, at 5:17 PM, Masanz, James J. <
> > >>>
> > >>> --
> > >>> Karthik Sarma
> > >>> UCLA Medical Scientist Training Program Class of 20??
> > >>> Member, UCLA Medical Imaging & Informatics Lab
> > >>> Member, CA Delegation to the House of Delegates of the American
> > >>> Medical Association
> > >>> ksarma@ksarma.com
> > >>> gchat: ksarma@gmail.com
> > >>> linkedin: www.linkedin.com/in/ksarma
> > >
> > > --
> > > Tim Miller
> > > Instructor
> > > Boston Children's Hospital and Harvard Medical School
> > > timothy.miller@childrens.harvard.edu
> > > 617-919-1223