incubator-ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: assistance with dictionary lookup issue
Date Tue, 05 Feb 2013 14:44:26 GMT
My two cents, for what they are worth ...

You never want to rely upon an implicit contract between developer and code.  What I mean
is that you never know how another developer (or you at a later time) might use something.
If a method can receive a parameter of a type other than the one it requires, then you
must assume that at some point it will.  Unless we utilize a subclass "SortedXXX" or a flag
"isSorted()" or something of that kind, random ordering must be assumed and dealt with, even
if it may lead to redundant sorts.  If the items are expected to be sorted already, then an
insertion sort is ok, since it does very little work on input that is already in order.

My vote is to leave the sort in there.
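
To illustrate the point about keeping the sort, here is a minimal sketch of a defensive
insertion sort. The class and method names (DefensiveSort, sortedCopy, isSorted) are
hypothetical and are not taken from cTAKES or from the CTAKES-143 patch; for annotations one
would presumably pass a comparator on the begin offset. The sketch only assumes that callers
may hand in items in any order and that the common case is input that is already sorted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not part of cTAKES): never trusts the caller's ordering. */
final class DefensiveSort {

    private DefensiveSort() {}

    /**
     * Returns a sorted copy of the input without modifying it.
     * Insertion sort: on already-sorted input the inner loop never runs,
     * so the cost is one comparison per element; worst case is quadratic.
     */
    static <T> List<T> sortedCopy(List<T> items, Comparator<? super T> order) {
        List<T> out = new ArrayList<>(items);
        for (int i = 1; i < out.size(); i++) {
            T current = out.get(i);
            int j = i - 1;
            // Shift larger elements one slot to the right.
            while (j >= 0 && order.compare(out.get(j), current) > 0) {
                out.set(j + 1, out.get(j));
                j--;
            }
            out.set(j + 1, current);
        }
        return out;
    }

    /** Explicit check, in the spirit of an "isSorted()" flag. */
    static <T> boolean isSorted(List<T> items, Comparator<? super T> order) {
        for (int i = 1; i < items.size(); i++) {
            if (order.compare(items.get(i - 1), items.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }
}
```

The redundant sort costs a single linear pass when the caller did pass sorted items. Java's
built-in List.sort behaves similarly on presorted input, so either choice keeps the safety
check cheap in the common case.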

-----Original Message-----
From: Tim Miller [mailto:timothy.miller@childrens.harvard.edu] 
Sent: Monday, February 04, 2013 5:35 PM
To: ctakes-dev@incubator.apache.org
Subject: Re: assistance with dictionary lookup issue

What do we know about the circumstances under which annotations will be sorted?

On 02/04/2013 05:01 PM, Masanz, James J. wrote:
> I'll take a look at the patch. Also be aware of
> https://issues.apache.org/jira/browse/CTAKES-31, which talks about a way of
> enhancing performance if we are willing to assume annotations (BaseTokens
> currently) are sorted. Currently it's always BaseToken and always sorted,
> just not sure if we want to code to that assumption.
>
> ________________________________________
> From: 
> ctakes-dev-return-1137-Masanz.James=mayo.edu@incubator.apache.org 
> [ctakes-dev-return-1137-Masanz.James=mayo.edu@incubator.apache.org] on 
> behalf of Tim Miller [timothy.miller@childrens.harvard.edu]
> Sent: Monday, February 04, 2013 3:43 PM
> To: ctakes-dev@incubator.apache.org
> Subject: assistance with dictionary lookup issue
>
> Pei helped me track down an issue with performance I'd noticed in the 
> dictionary annotator, and I have filed the issue here:
> https://issues.apache.org/jira/browse/CTAKES-143
>
> I implemented a quick and dirty proof of concept fix and noticed 
> dramatic performance improvement.  I attached the patch to the issue, 
> but it involves changing an interface (currently does not try to fix 
> other implementing classes so obviously not ready for primetime), so I 
> wanted to solicit the list first in case anyone with better knowledge 
> of that module has some better engineering ideas than what I came up with.
>
> Thanks,
>
> --
> Tim Miller, PhD
> Postdoctoral Research Fellow
> Children's Hospital Informatics Program
> Children's Hospital Boston and Harvard Medical School
> 617-919-1223

