From: Pei Chen
To: "dev@ctakes.apache.org"
Subject: Re: Permutations
Date: Fri, 5 Sep 2014 12:17:16 -0400
In-Reply-To: <5408E608.7040602@perfectsearchcorp.com>

Hi Kim,

Thanks for pointing that out. https://issues.apache.org/jira/browse/CTAKES-310 has been opened for this. If you commit the changes, we can see whether they can be included in the 3.2.1 patch release.

I was looking at the changelist for this file, and it looks like some of these optimizations may have been intentional on Sean's part, so he may have more insight into this bit of the logic.

On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert wrote:
> Hi All,
>
> I was reviewing the use of permutations, and I noticed that we sorted
> the permutation list before creating the string to do the concept lookup
> with. It also appears that we were sorting the object that was stored in
> the parent list.
>
> I've made a few changes, and now it appears I can discover some
> additional concepts based upon the permutations.
>
> Let me know what you think of the following changes.
>
> Thanks,
>
> Kim
>
> === modified file 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java'
> --- ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java 2014-07-31 22:00:48 +0000
> +++ ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java 2014-09-04 18:39:59 +0000
> @@ -210,11 +210,12 @@
>        final List<List<Integer>> permutationList = iv_permCacheMap.get( permutationIndex );
>        for ( List<Integer> permutations : permutationList ) {
>           // Moved sort and offset calculation from inner (per MetaDataHit) iteration 2-21-2013 spf
> -         Collections.sort( permutations );
> +         List<Integer> permutationsSorted = (List<Integer>) ((ArrayList<Integer>)permutations).clone();
> +         Collections.sort( permutationsSorted );
>           int startOffset = firstWordStartOffset;
>           int endOffset = firstWordEndOffset;
> -         if ( !permutations.isEmpty() ) {
> -            int firstIdx = permutations.get( 0 );
> +         if ( !permutationsSorted.isEmpty() ) {
> +            int firstIdx = permutationsSorted.get( 0 );
>              if ( firstIdx <= firstTokenIndex ) {
>                 firstIdx--;
>              }
> @@ -222,7 +223,7 @@
>              if ( firstToken.getStartOffset() < firstWordStartOffset ) {
>                 startOffset = firstToken.getStartOffset();
>              }
> -            int lastIdx = permutations.get( permutations.size() - 1 );
> +            int lastIdx = permutationsSorted.get( permutationsSorted.size() - 1 );
>              if ( lastIdx <= firstTokenIndex ) {
>                 lastIdx--;
>              }
>
>
> --
> Kim Ebert
> 1.801.669.7342
> Perfect Search Corp
> http://www.perfectsearchcorp.com/
>
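
For anyone following the thread, here is a minimal standalone sketch of the aliasing issue the patch appears to address: sorting a list in place also reorders the copy held by whatever cache owns it, so a lookup string built afterwards no longer reflects the permutation order. The class and method names below (PermutationSortSketch, sortedCopy, buildKey) are hypothetical and are not part of cTAKES, and the index handling is simplified to plain token positions. The sketch copies with the ArrayList copy constructor rather than the clone() call used in the patch, which should behave the same for this purpose while avoiding the unchecked cast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names exist in cTAKES.
class PermutationSortSketch {

    // Returns a sorted copy and leaves the caller's (possibly cached) list untouched.
    static List<Integer> sortedCopy(List<Integer> permutation) {
        List<Integer> copy = new ArrayList<>(permutation);
        Collections.sort(copy);
        return copy;
    }

    // Builds a lookup key by visiting tokens in the order the permutation gives.
    // Assumption: indices are plain positions into the token list.
    static String buildKey(List<String> tokens, List<Integer> permutation) {
        StringBuilder sb = new StringBuilder();
        for (int idx : permutation) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(tokens.get(idx));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("pain", "chest", "severe");
        List<Integer> cached = new ArrayList<>(Arrays.asList(1, 0));

        // Sorted copy: usable for min/max offset calculation, cache unchanged.
        List<Integer> sorted = sortedCopy(cached);
        System.out.println(sorted.get(0) + ".." + sorted.get(sorted.size() - 1)); // prints "0..1"
        System.out.println(buildKey(tokens, cached)); // prints "chest pain"

        // In-place sort: the cached permutation itself is reordered.
        Collections.sort(cached);
        System.out.println(buildKey(tokens, cached)); // prints "pain chest"
    }
}
```

With the copy, the first and last indices needed for the span offsets are still available, while the permutation order survives for building the lookup string. Whether the original in-place sort was a deliberate optimization is a question for Sean, as noted above.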