ctakes-dev mailing list archives

From Kim Ebert <kim.eb...@perfectsearchcorp.com>
Subject Re: Permutations
Date Fri, 05 Sep 2014 16:28:37 GMT
Hi Pei and Sean,

Sean, any thoughts about this would be helpful.

We also had issues in cTAKES 2.5.

Here is the patch for 2.5. Sean made his changes before I could port
the patch to 3.0.

=== modified file 'src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java'
--- src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java   2012-11-28 01:56:50 +0000
+++ src/edu/mayo/bmi/lookup/algorithms/FirstTokenPermutationImpl.java   2013-02-06 16:39:37 +0000
@@ -294,14 +294,16 @@
                         Iterator mdhIterator = mdhSet.iterator();
                         while (mdhIterator.hasNext())
                         {
-                            MetaDataHit mdh = (MetaDataHit) mdhIterator.next();
+                            MetaDataHit mdh = (MetaDataHit) mdhIterator.next();
+
+                            List permutationSorted = (List) ((ArrayList)permutation).clone();
                             // figure out start and end offsets
-                            Collections.sort(permutation);
+                            Collections.sort(permutationSorted);
 
                             int startOffset;
-                            if (permutation.size() > 0)
+                            if (permutationSorted.size() > 0)
                             {
-                                int firstIdx = ((Integer) permutation.get(0)).intValue();
+                                int firstIdx = ((Integer) permutationSorted.get(0)).intValue();
                                 if (firstIdx <= firstTokenIndex.intValue())
                                 {
                                     firstIdx--;
@@ -322,9 +324,9 @@
                             }
 
                             int endOffset;
-                            if (permutation.size() > 0)
+                            if (permutationSorted.size() > 0)
                             {
-                                int lastIdx = ((Integer) permutation.get(permutation.size() - 1)).intValue();
+                                int lastIdx = ((Integer) permutationSorted.get(permutationSorted.size() - 1)).intValue();
                                 if (lastIdx <= firstTokenIndex.intValue())
                                 {
                                     lastIdx--;


Kim Ebert
1.801.669.7342
Perfect Search Corp
http://www.perfectsearchcorp.com/

On 09/05/2014 10:17 AM, Pei Chen wrote:
> Hi Kim,
> Thanks for pointing that out.
> https://issues.apache.org/jira/browse/CTAKES-310 has been opened for
> this.
> If you commit the changes, we can see whether we can include them in
> the 3.2.1 patch release.
> I was looking at the changelist for this file, and it looks like
> some of these optimizations may have been intentional on Sean's part,
> so he may have more insight into this bit of the logic.
>
> On Thu, Sep 4, 2014 at 6:22 PM, Kim Ebert
> <kim.ebert@perfectsearchcorp.com> wrote:
>> Hi All,
>>
>> I was reviewing the use of permutations, and I noticed that we sorted
>> the permutation list before creating the string to do the concept lookup
>> with. It also appears that we were sorting the object that was stored in
>> the parent list.
>>
>> I've made a few changes, and now it appears I can discover some
>> additional concepts based upon the permutations.
>>
>> Let me know what you think of the following changes.
>>
>> Thanks,
>>
>> Kim
>>
>> === modified file 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java'
>> --- ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java 2014-07-31 22:00:48 +0000
>> +++ ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java 2014-09-04 18:39:59 +0000
>> @@ -210,11 +210,12 @@
>>        final List<List<Integer>> permutationList = iv_permCacheMap.get( permutationIndex );
>>        for ( List<Integer> permutations : permutationList ) {
>>           // Moved sort and offset calculation from inner (per MetaDataHit) iteration 2-21-2013 spf
>> -         Collections.sort( permutations );
>> +         List<Integer> permutationsSorted = (List) ((ArrayList)permutations).clone();
>> +         Collections.sort( permutationsSorted );
>>           int startOffset = firstWordStartOffset;
>>           int endOffset = firstWordEndOffset;
>> -         if ( !permutations.isEmpty() ) {
>> -            int firstIdx = permutations.get( 0 );
>> +         if ( !permutationsSorted.isEmpty() ) {
>> +            int firstIdx = permutationsSorted.get( 0 );
>>              if ( firstIdx <= firstTokenIndex ) {
>>                 firstIdx--;
>>              }
>> @@ -222,7 +223,7 @@
>>              if ( firstToken.getStartOffset() < firstWordStartOffset ) {
>>                 startOffset = firstToken.getStartOffset();
>>              }
>> -            int lastIdx = permutations.get( permutations.size() - 1 );
>> +            int lastIdx = permutationsSorted.get( permutationsSorted.size() - 1 );
>>              if ( lastIdx <= firstTokenIndex ) {
>>                 lastIdx--;
>>              }
>>
>>
>> --
>> Kim Ebert
>> 1.801.669.7342
>> Perfect Search Corp
>> http://www.perfectsearchcorp.com/
>>
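
The problem described in the quoted message (sorting a permutation in place before the lookup string is built, which also reorders the list held by the parent cache) can be illustrated with a minimal sketch. Everything here is hypothetical and simplified: the class PermutationSortDemo, its methods, and the sample tokens are not cTAKES code, and the indices are plain positions in a token list rather than the offsets relative to the first token that FirstTokenPermutationImpl uses. The sketch also uses the ArrayList copy constructor instead of the raw clone() cast in the patches, which should behave the same way for this purpose while avoiding unchecked casts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of cTAKES.
class PermutationSortDemo {

    // Joins tokens in the order given by the permutation to form a lookup string.
    static String buildLookupString(List<String> tokens, List<Integer> permutation) {
        StringBuilder sb = new StringBuilder();
        for (int idx : permutation) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(tokens.get(idx));
        }
        return sb.toString();
    }

    // Returns a sorted copy so the caller's (possibly cached) list keeps its order.
    static List<Integer> sortedCopy(List<Integer> permutation) {
        List<Integer> copy = new ArrayList<>(permutation);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("pain", "chest", "severe");
        // Stands in for one entry of a shared permutation cache.
        List<Integer> cached = new ArrayList<>(Arrays.asList(2, 0, 1));

        // Sorting a copy: first and last indices are available, order is preserved.
        List<Integer> sorted = sortedCopy(cached);
        System.out.println(sorted.get(0) + ".." + sorted.get(sorted.size() - 1)); // prints "0..2"
        System.out.println(buildLookupString(tokens, cached)); // prints "severe pain chest"

        // Sorting in place: the cached permutation collapses to index order.
        Collections.sort(cached);
        System.out.println(buildLookupString(tokens, cached)); // prints "pain chest severe"
    }
}
```

With the in-place sort, every permutation of the same indices yields the same lookup string, which would be consistent with the additional concepts found after the change.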

