From: Tim Miller
Date: Mon, 4 Feb 2013 17:34:08 -0500
Reply-To: ctakes-dev@incubator.apache.org
Subject: Re: assistance with dictionary lookup issue
Message-ID: <51103760.90903@childrens.harvard.edu>
In-Reply-To: <996FC801C05DF64A84246A106FACACD00772DB@MSGPEXCHA08A.mfad.mfroot.org>

What do we know about the circumstances under which annotations will be
sorted?

On 02/04/2013 05:01 PM, Masanz, James J. wrote:
> I'll take a look at the patch. Also be aware of
> https://issues.apache.org/jira/browse/CTAKES-31 which talks about a way
> of enhancing performance -- if willing to assume annotations (BaseTokens
> currently) are sorted. Currently it's always BaseToken and always
> sorted, just not sure if we want to code to that assumption.
>
> ________________________________________
> From: ctakes-dev-return-1137-Masanz.James=mayo.edu@incubator.apache.org [ctakes-dev-return-1137-Masanz.James=mayo.edu@incubator.apache.org] on behalf of Tim Miller [timothy.miller@childrens.harvard.edu]
> Sent: Monday, February 04, 2013 3:43 PM
> To: ctakes-dev@incubator.apache.org
> Subject: assistance with dictionary lookup issue
>
> Pei helped me track down an issue with performance I'd noticed in the
> dictionary annotator, and I have filed the issue here:
> https://issues.apache.org/jira/browse/CTAKES-143
>
> I implemented a quick and dirty proof-of-concept fix and noticed a
> dramatic performance improvement. I attached the patch to the issue,
> but it involves changing an interface (it currently does not try to fix
> other implementing classes, so it is obviously not ready for prime
> time), so I wanted to solicit the list first in case anyone with better
> knowledge of that module has some better engineering ideas than what I
> came up with.
>
> Thanks,
>
> --
> Tim Miller, PhD
> Postdoctoral Research Fellow
> Children's Hospital Informatics Program
> Children's Hospital Boston and Harvard Medical School
> 617-919-1223