ctakes-user mailing list archives

From Prashasti Agrawal <prashasti.agra...@wincere.com>
Subject Re: Inconsistent IdentifiedAnnotation in different runs
Date Fri, 24 Jul 2015 07:07:56 GMT
Hi Pei,


I figured out where the problem is, but I have not been able to figure out the reason or a
solution. I had configured my own dictionary from the UMLS knowledge sources, with two tables
in MySQL: one containing CUIs from the SNOMEDCT source (umls_snomed_2015, for diseases,
symptoms, etc.) and the other containing CUIs from RXNORM (umls_rxNorm_2015, for medications).
After a lot of debugging and print statements, I found that in the lookup consumer
(UmlsToSnomedDbConsumerImpl), lookup hits are sometimes matched against the valid TUIs for
DICT_UMLS_MS and sometimes against the valid TUIs for DICT_RXNORM_MS. I have attached the
LookUpDesc_Db file for reference.


<?xml version="1.0" encoding="UTF-8"?>
<!--

    Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership.  The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied.  See the License for the
    specific language governing permissions and limitations
    under the License.

-->
<lookupSpecification>
  <!--  Defines what dictionaries will be used in terms of implementation specifics and metaField configuration. -->
  <dictionaries>
    <dictionary id="DICT_UMLS_MS" externalResourceKey="DbConnection" caseSensitive="false">
      <implementation>
        <jdbcImpl tableName="umls_ms_2015"/>
      </implementation>
      <lookupField fieldName="fword"/>
      <metaFields>
        <metaField fieldName="cui"/>
        <metaField fieldName="tui"/>
        <metaField fieldName="text"/>
      </metaFields>
    </dictionary>
    <dictionary id="DICT_RXNORM_MS" externalResourceKey="DbConnection" caseSensitive="false">
      <implementation>
        <jdbcImpl tableName="umls_rxNorm_2015"/>
      </implementation>
      <lookupField fieldName="fword"/>
      <metaFields>
        <metaField fieldName="cui"/>
        <metaField fieldName="tui"/>
        <metaField fieldName="text"/>
      </metaFields>
    </dictionary>
  </dictionaries>
  <!-- Binds together the components necessary to perform the complete lookup logic start to end. -->
  <lookupBindings>
    <lookupBinding>
      <dictionaryRef idRef="DICT_UMLS_MS"/>
      <lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
        <properties>
          <property key="textMetaFields" value="text"/>
          <property key="maxPermutationLevel" value="7"/>
          <!-- <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.Sentence"/> -->
          <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
          <property key="exclusionTags" value="VB,VBD,VBG,VBN,VBP,VBZ,CC,CD,DT,EX,IN,LS,MD,PDT,POS,PP,PP$,PRP,PRP$,RP,TO,WDT,WP,WPS,WRB"/>
        </properties>
      </lookupInitializer>
      <lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl">
        <properties>
          <property key="codingScheme" value="SNOMED"/>
          <property key="cuiMetaField" value="cui"/>
          <property key="tuiMetaField" value="tui"/>
          <property key="textMetaField" value="text"/>
          <property key="anatomicalSiteTuis" value="T021,T022,T023,T024,T025,T026,T029,T030"/>
          <property key="procedureTuis" value="T060,T061"/>
          <property key="disorderTuis" value="T019,T020,T037,T046,T047,T048,T049,T050,T190,T191"/>
          <property key="findingTuis" value="T033,T034,T040,T041,T042,T043,T044,T045,T056,T057,T184"/>
          <property key="labTuis" value="T059,T116"/>
          <property key="dbConnExtResrcKey" value="DbConnection"/>
          <property key="mapPrepStmt" value="select code from umls_snomed_map where cui=?"/>
        </properties>
      </lookupConsumer>
    </lookupBinding>
    <lookupBinding>
      <dictionaryRef idRef="DICT_RXNORM_MS"/>
      <lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
        <properties>
          <property key="textMetaFields" value="text"/>
          <property key="maxPermutationLevel" value="7"/>
          <!-- <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.Sentence"/> -->
          <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
          <property key="exclusionTags" value="VB,VBD,VBG,VBN,VBP,VBZ,CC,CD,DT,EX,IN,LS,MD,PDT,POS,PP,PP$,PRP,PRP$,RP,TO,WDT,WP,WPS,WRB"/>
        </properties>
      </lookupInitializer>
      <lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl">
        <properties>
          <property key="codingScheme" value="RXNORM"/>
          <property key="cuiMetaField" value="cui"/>
          <property key="tuiMetaField" value="tui"/>
          <property key="textMetaField" value="text"/>
          <property key="medicationTuis" value="T073,T103,T109,T110,T111,T115,T121,T122,T123,T130,T168,T192,T195,T197,T200,T203"/>
          <property key="dbConnExtResrcKey" value="DbConnection"/>
          <property key="mapPrepStmt" value="select code from umls_rxNorm_map where cui=?"/>
        </properties>
      </lookupConsumer>
    </lookupBinding>
  </lookupBindings>
</lookupSpecification>
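
To make the expected behavior concrete, it may help to list which TUIs each binding in the descriptor is supposed to accept, so that debug output from the consumer can be compared against it. The following is only a minimal sketch using the JDK's DOM parser, not cTAKES code: the class name `BindingTuiReport`, the method `tuisPerDictionary`, and the rule "any property whose key ends in `Tuis`" are assumptions made for illustration, and the XML in `main` is a trimmed copy of the descriptor above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class BindingTuiReport {

    /**
     * Maps each dictionaryRef idRef to the union of all property values whose
     * key ends in "Tuis" inside the same lookupBinding (an illustrative rule).
     */
    public static Map<String, Set<String>> tuisPerDictionary(String descriptorXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(descriptorXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse descriptor", e);
        }
        Map<String, Set<String>> result = new LinkedHashMap<>();
        NodeList bindings = doc.getElementsByTagName("lookupBinding");
        for (int i = 0; i < bindings.getLength(); i++) {
            Element binding = (Element) bindings.item(i);
            Element ref = (Element) binding.getElementsByTagName("dictionaryRef").item(0);
            Set<String> tuis = new TreeSet<>();
            NodeList props = binding.getElementsByTagName("property");
            for (int j = 0; j < props.getLength(); j++) {
                Element prop = (Element) props.item(j);
                if (!prop.getAttribute("key").endsWith("Tuis")) {
                    continue; // skip codingScheme, mapPrepStmt, exclusionTags, etc.
                }
                for (String tui : prop.getAttribute("value").split(",")) {
                    if (!tui.trim().isEmpty()) {
                        tuis.add(tui.trim()); // trim tolerates stray whitespace in the value
                    }
                }
            }
            result.put(ref.getAttribute("idRef"), tuis);
        }
        return result;
    }

    public static void main(String[] args) {
        // Trimmed copy of the descriptor: one TUI property per binding.
        String xml =
              "<lookupSpecification><lookupBindings>"
            + "<lookupBinding><dictionaryRef idRef='DICT_UMLS_MS'/>"
            + "<lookupConsumer><properties>"
            + "<property key='codingScheme' value='SNOMED'/>"
            + "<property key='procedureTuis' value='T060,T061'/>"
            + "</properties></lookupConsumer></lookupBinding>"
            + "<lookupBinding><dictionaryRef idRef='DICT_RXNORM_MS'/>"
            + "<lookupConsumer><properties>"
            + "<property key='codingScheme' value='RXNORM'/>"
            + "<property key='medicationTuis' value='T073,T103,T109'/>"
            + "</properties></lookupConsumer></lookupBinding>"
            + "</lookupBindings></lookupSpecification>";
        tuisPerDictionary(xml).forEach((dict, tuis) -> System.out.println(dict + " -> " + tuis));
    }
}
```

If a hit from DICT_RXNORM_MS is ever checked against a TUI that this report lists only under DICT_UMLS_MS (or the reverse), that would be consistent with the mix-up described above.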




Regards,

Prashasti Agrawal | Data Engineer | Noida INDIA | GMT +5:30 hours

Mobile +91 9818812484 | prashasti.agrawal@wincere.com

www.wincere.com

DISCLAIMER: This electronic transmission is governed by Wincere Inc. Any views or opinions
expressed in this email are solely those of the author and do not necessarily reflect the
opinions of Wincere Inc. If you have received this email in error, please delete all copies
from your system and notify the sender or contact us at: +1 855 855 2946
or support@wincere.com.




________________________________
From: Chen, Pei <Pei.Chen@childrens.harvard.edu>
Sent: Friday, July 24, 2015 12:11 AM
To: user@ctakes.apache.org
Subject: RE: Inconsistent IdentifiedAnnotation in different runs


By any chance, are you running this in multi-threaded mode within the same JVM? And do you
have LVG included in the pipeline?

I vaguely recall there was some non-thread-safe code in the LVG component (I don't recall
whether the fix has made it into the latest release yet).



If the behavior persists, would you be able to help recreate it with sample/dummy examples
that could be shared? In particular, the output XMI files?
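
One way to put together a shareable reproduction is a small harness that runs the same document several times in a single thread and records every distinct set of per-type annotation counts. This is only a sketch under stated assumptions: `RunConsistencyCheck`, `distinctResults`, and `counts` are hypothetical names, and the two `Supplier` stubs stand in for a real pipeline run (they are not cTAKES or UIMA API).

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class RunConsistencyCheck {

    /** Calls pipelineRun sequentially and collects every distinct result it returns. */
    public static <T> Set<T> distinctResults(Supplier<T> pipelineRun, int runs) {
        Set<T> seen = new LinkedHashSet<>();
        for (int i = 0; i < runs; i++) {
            seen.add(pipelineRun.get());
        }
        return seen;
    }

    /** Hypothetical helper: builds a one-entry count map for the stub runs below. */
    static Map<String, Integer> counts(String annotationType, int n) {
        Map<String, Integer> m = new TreeMap<>();
        m.put(annotationType, n);
        return m;
    }

    public static void main(String[] args) {
        // Stub for a deterministic pipeline: the same counts on every run.
        Supplier<Map<String, Integer>> stable = () -> counts("DiseaseDisorderMention", 8);

        // Stub mimicking the reported symptom: counts differ from run to run.
        AtomicInteger call = new AtomicInteger();
        Supplier<Map<String, Integer>> flaky =
                () -> counts("DiseaseDisorderMention", call.getAndIncrement() % 2 == 0 ? 8 : 15);

        System.out.println("stable: " + distinctResults(stable, 4));
        System.out.println("flaky:  " + distinctResults(flaky, 4));
    }
}
```

More than one distinct result from sequential, single-threaded runs would suggest the inconsistency is not caused by concurrent access alone.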

--Pei



From: Prashasti Agrawal [mailto:prashasti.agrawal@wincere.com]
Sent: Thursday, July 23, 2015 5:05 AM
To: user@ctakes.apache.org
Subject: Inconsistent IdentifiedAnnotation in different runs



Hi,



I am running the AggregatePlainTextUMLSProcessor analysis engine on an EMR document. I have
added some modules, such as drug NER and the template filler, to the pipeline. I am getting
different IdentifiedAnnotations in different runs on the same document (for example, 8
DiseaseDisorderMention annotations in one run, but 15 in another).



I am unable to understand why this is so. What am I missing here?



Regards,

Prashasti Agrawal | Data Engineer | Noida INDIA | GMT +5:30 hours



