ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: Differences in MedicationMention annotations on subsequent processing runs
Date Wed, 08 Oct 2014 15:46:33 GMT
Hi Bruce,
I would venture to say that this is neither expected nor desired.

Before you fix it (or in addition to a fix), try running with the new dictionary lookup.
It behaves differently, and it will be the default dictionary lookup in future
releases of cTakes, which makes fixes to the current module slightly less urgent.


From: Bruce Tietjen [mailto:bruce.tietjen@perfectsearchcorp.com]
Sent: Wednesday, October 08, 2014 11:38 AM
To: dev@ctakes.apache.org
Subject: Differences in MedicationMention annotations on subsequent processing runs

I have encountered a situation in which the cTakes clinical pipeline output differs between
multiple runs on the same text with the same configuration.
The following snippets from a single document are sufficient to demonstrate the issue:

 a gentle curve going into. irrigated with Bacitracin.

The source of the difference is that the DictionaryLookupAnnotator uses a map to filter out
duplicate annotations for a single document location:
    // used to prevent duplicate hits
    // key = hit begin,end key (java.lang.String)
    // val = Set of MetaDataHit objects
    private Map<String,Set<MetaDataHit>> iv_dupMap = new HashMap<>();

This map is shared between the umls_ms_2011ab lookup and the umls_ms_2011an_rxnorm lookup.

If both dictionaries contain the same term, the order of dictionary lookup execution determines
the output. If the rxnorm lookup runs first, then a MedicationMention annotation for Bacitracin
appears in the final output. If the standard umls lookup runs first, then there is no MedicationMention
annotation for Bacitracin.
I will attach the output from the subsequent runs. (Hopefully the attachment will make it
through the system)
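
The order dependence described above can be illustrated with a minimal, self-contained sketch. This is not the actual DictionaryLookupAnnotator code: the class name DupMapOrderDemo, the lookup names "umls" and "rxnorm", the "EntityMention" type, and the span key are hypothetical stand-ins. It only assumes what is stated above, namely that one duplicate-filtering map keyed by hit location is shared by both lookups, so whichever lookup reaches a span first decides the annotation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DupMapOrderDemo {

    // Hypothetical mapping: the annotation type each lookup would create.
    // "EntityMention" is a placeholder for whatever the non-rxnorm lookup produces.
    static String annotationTypeFor(String lookupName) {
        return lookupName.equals("rxnorm") ? "MedicationMention" : "EntityMention";
    }

    // Runs the lookups in the given order against one shared duplicate map
    // and returns the annotation types that were actually created.
    static List<String> run(List<String> lookupOrder) {
        // key = hit begin,end key; val = hits already seen at that location
        Map<String, Set<String>> dupMap = new HashMap<>();
        List<String> created = new ArrayList<>();

        String spanKey = "0,10";     // hypothetical begin,end of "Bacitracin"
        String hit = "bacitracin";   // both dictionaries contain this term

        for (String lookup : lookupOrder) {
            Set<String> seen = dupMap.get(spanKey);
            if (seen == null) {
                seen = new HashSet<>();
                dupMap.put(spanKey, seen);
            }
            // Set.add returns false for a duplicate, so a later lookup
            // that finds the same hit at the same span is silently dropped.
            if (seen.add(hit)) {
                created.add(annotationTypeFor(lookup));
            }
        }
        return created;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("rxnorm", "umls"))); // [MedicationMention]
        System.out.println(run(Arrays.asList("umls", "rxnorm"))); // [EntityMention]
    }
}
```

In this sketch, giving each lookup its own duplicate map, or including the dictionary name in the key, would make the result independent of execution order. Whether that is the right fix for the real annotator is a separate question.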

Is this expected behavior? If not, what would be the expected behavior?

IMAT Solutions <http://imatsolutions.com>
Bruce Tietjen
Senior Software Engineer
Mobile: 801.634.1547