ctakes-dev mailing list archives

From Bruce Tietjen <bruce.tiet...@perfectsearchcorp.com>
Subject Differences in MedicationMention annotations on subsequent processing runs
Date Wed, 08 Oct 2014 15:37:41 GMT
I have encountered a situation in which the cTAKES clinical pipeline output
differs between runs on the same text with the same configuration.

The following snippets from a single document are sufficient to demonstrate
the issue:

 a gentle curve going into. irrigated with Bacitracin.


The source of the difference is that the DictionaryLookupAnnotator uses a
map to filter out duplicate annotations for a single document location:

    // used to prevent duplicate hits
    // key = hit begin,end key (java.lang.String)
    // val = Set of MetaDataHit objects
    private Map<String,Set<MetaDataHit>> iv_dupMap = new HashMap<>();


This map is shared between both the umls_ms_2011ab lookup and the
umls_ms_2011an_rxnorm lookup.

If both dictionaries contain the same term, the order of dictionary lookup
execution determines the output. If the rxnorm lookup runs first, then a
MedicationMention annotation for Bacitracin appears in the final output. If
the standard umls lookup runs first, then there is no MedicationMention
annotation for Bacitracin.
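
To illustrate the effect, here is a minimal sketch of a duplicate filter
shared by two lookups. It is not cTAKES code: the class SharedDupMapDemo, its
methods, the "OtherMention" label, the character offsets, and the assumption
that duplicates are matched on term text at the same span are all hypothetical
stand-ins chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the annotator; not actual cTAKES code.
public class SharedDupMapDemo {

    // Shared by every dictionary lookup, in the spirit of iv_dupMap.
    // key = "begin,end" span key, val = terms already seen at that span
    private final Map<String, Set<String>> dupMap = new HashMap<>();

    // Annotations that survived the duplicate filter, in creation order.
    private final List<String> annotations = new ArrayList<>();

    // Records an annotation of the given type unless an earlier lookup
    // already produced a hit for the same term at the same span.
    void lookup(String annotationType, int begin, int end, String term) {
        String key = begin + "," + end;
        Set<String> seen = dupMap.computeIfAbsent(key, k -> new HashSet<>());
        if (seen.add(term)) {
            annotations.add(annotationType + ":" + term);
        }
    }

    // Runs both lookups over the same span in the requested order.
    // The offsets 42 and 52 are illustrative only.
    public static List<String> run(boolean rxnormFirst) {
        SharedDupMapDemo demo = new SharedDupMapDemo();
        if (rxnormFirst) {
            demo.lookup("MedicationMention", 42, 52, "bacitracin");
            demo.lookup("OtherMention", 42, 52, "bacitracin");
        } else {
            demo.lookup("OtherMention", 42, 52, "bacitracin");
            demo.lookup("MedicationMention", 42, 52, "bacitracin");
        }
        return demo.annotations;
    }

    public static void main(String[] args) {
        System.out.println("rxnorm first: " + run(true));
        System.out.println("umls first:   " + run(false));
    }
}
```

In this sketch, whichever lookup reaches the span first claims it, and the
second hit is silently dropped, so the annotation type in the output depends
only on execution order.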

I will attach the output from the subsequent runs. (Hopefully the
attachment will make it through the system.)

Is this expected behavior? If not, what would be the expected behavior?

Bruce Tietjen
Senior Software Engineer
IMAT Solutions <http://imatsolutions.com>
Mobile: 801.634.1547
bruce.tietjen@imatsolutions.com
