ctakes-dev mailing list archives

From "Vogel, James" <JVo...@activehealth.net>
Subject RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor
Date Sat, 28 Sep 2013 00:29:52 GMT
I now see that I can use a query on umls_ms_2011ab where sourcetype = 'ICD9CM'.  Is there a way
to use an existing AE or class to add ICD9CM annotations / concepts, or do I need to change
the code in consumeHits() or getSnomedCodes()?
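
For reference, the query mentioned above might be issued through JDBC roughly as follows. This is a minimal sketch, not cTAKES code: only the table name umls_ms_2011ab and the sourcetype column come from this thread. The cui and code column names, the class name Icd9CodeQuery, and its methods are assumptions made for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of cTAKES. */
final class Icd9CodeQuery {
    // Table name as mentioned in this thread.
    static final String TABLE = "umls_ms_2011ab";

    /** Builds a parameterized query; the cui and code column names are assumed. */
    static String buildSql(String table) {
        return "SELECT code FROM " + table + " WHERE cui = ? AND sourcetype = ?";
    }

    /** Returns every code stored for the given CUI under one source type, e.g. "ICD9CM". */
    static List<String> findCodes(Connection conn, String cui, String sourceType)
            throws SQLException {
        List<String> codes = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(buildSql(TABLE))) {
            stmt.setString(1, cui);
            stmt.setString(2, sourceType);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    codes.add(rs.getString(1));
                }
            }
        }
        return codes;
    }
}
```

Passing the source type as a parameter rather than hard-coding it means the same statement could serve SNOMEDCT and ICD9CM lookups alike.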

-----Original Message-----
From: Vogel, James
Sent: Friday, September 27, 2013 6:30 PM
To: dev@ctakes.apache.org
Subject: RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

Is anyone able to provide any more detailed guidance on what I'd need to change to add the
ICD9 codes as tags, e.g., where do I look for the tables in the hsql database that would contain
the ICD9 data?

Thanks.

-----Original Message-----
From: Miller, Timothy [mailto:Timothy.Miller@childrens.harvard.edu]
Sent: Monday, September 16, 2013 7:25 AM
To: dev@ctakes.apache.org
Subject: Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

James,
I haven't done it myself, so I don't know exactly how the config
changes, but I know roughly where to look.  In LookupDesc_Db.xml, find
the <lookupBinding> tag with idRef = DICT_UMLS_MS. Then look under
the <lookupConsumer> section, and you'll see the codingScheme is SNOMED.
I believe this is where the actual dictionary filtering is done. There
is also a consumer class called
org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl with a
mapPrepStmt field holding a SQL query that might need changing. That is
where I would start looking. I'm not sure whether you would need to
write a new consumer class, or what values the codingScheme field can
take, but hopefully this helps you get started until someone else chimes
in with more detailed info!
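
The filtering step described above can be sketched in isolation. What follows is a hypothetical illustration, not the cTAKES consumer API: CodeMapping and CodingSchemeFilter are made-up names, the row layout is assumed, and the idea that the consumer keeps only rows whose coding scheme matches one configured value is an assumption based on the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** One row of an assumed CUI-to-code mapping; field names are illustrative. */
final class CodeMapping {
    final String cui;
    final String codingScheme;
    final String code;

    CodeMapping(String cui, String codingScheme, String code) {
        this.cui = Objects.requireNonNull(cui);
        this.codingScheme = Objects.requireNonNull(codingScheme);
        this.code = Objects.requireNonNull(code);
    }
}

/** Keeps only the codes that belong to one configured coding scheme. */
final class CodingSchemeFilter {
    private final String codingScheme;

    CodingSchemeFilter(String codingScheme) {
        this.codingScheme = Objects.requireNonNull(codingScheme);
    }

    /** Returns the codes for the given CUI in the configured scheme, in input order. */
    List<String> codesFor(String cui, List<CodeMapping> mappings) {
        List<String> codes = new ArrayList<>();
        for (CodeMapping m : mappings) {
            if (m.cui.equals(cui) && m.codingScheme.equals(codingScheme)) {
                codes.add(m.code);
            }
        }
        return codes;
    }
}
```

Under these assumptions, constructing the filter with "ICD9CM" instead of "SNOMEDCT" would be the only change needed to return ICD9 codes. Whether the real consumer can be reconfigured this way, or needs a new class, is the open question in this thread.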

Tim

On 09/15/2013 08:39 PM, Vogel, James wrote:
> Any more guidance you can give about the nature of the changes to the config and impl
> that would need to be made to get the ICD9 codes?
>
> -----Original Message-----
> From: Pei Chen [mailto:chenpei@apache.org]
> Sent: Wednesday, September 04, 2013 1:02 PM
> To: dev@ctakes.apache.org
> Subject: Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor
>
> Ted,
>
>> On another note, I know the cTAKES dictionary uses ICD9, but I'm not
>> familiar with how to access that information: In the example I've
>> described below, where would I locate the ICD9 for a specific entity?
>
> Even though ICD9 is included in the lookup, IIRC, cTAKES by default is
> configured [1] to return/store only concepts [2] that have a SNOMEDCT code or
> RxNorm code.
>
> [1]
> http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-res/src/main/resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml
>
> [2]
> http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/ae/UmlsToSnomedConsumerImpl.java
>
> If you would like it to return ICD9 codes, you would need to
> modify/configure the above...
>
> --Pei
>
>
> On Wed, Sep 4, 2013 at 11:55 AM, Assur, Ted
> <Theodore.Assur@providence.org> wrote:
>
>> Thanks for looking into this, it's been puzzling me.
>>
>> On another note, I know the cTAKES dictionary uses ICD9, but I'm not
>> familiar with how to access that information: In the example I've described
>> below, where would I locate the ICD9 for a specific entity?
>>
>> Thank you
>>
>> Ted
>>
>> -----Original Message-----
>> From: Pei Chen [mailto:chenpei@apache.org]
>> Sent: Tuesday, September 03, 2013 7:13 PM
>> To: dev@ctakes.apache.org
>> Subject: Re: specificity in selecting EntityMentions when using
>> AggregatePlaintextUMLSProcessor
>>
>> You're right, it should have gotten "CIN I"- that's a strange one,
>> probably needs to be debugged/looked into further...
>>
>> On Tue, Sep 3, 2013 at 10:05 PM, Miller, Timothy <
>> Timothy.Miller@childrens.harvard.edu> wrote:
>>> Ah. So it will get
>>> CIN 2 (in SNOMED)
>>> CIN III (in SNOMED)
>>> CIN 3 (in SNOMED)
>>>
>>> but the rest are not in SNOMED?
>>>
>>> I wonder why it doesn't get CIN I? It looks like that exists in SNOMED
>>> (though I don't fully understand what all the symbols mean in the umls
>>> browser).
>>>
>>>> CIN I - Cervical intraepithelial neoplasia 1
>>>> [A3002690/SNOMEDCT/SY/285836003]
>>>
>>> On 09/03/2013 09:55 PM, Pei Chen wrote:
>>>> It has the correct parse (POS, chunks, and lookupwindow), but some of
>>>> the terms do not exist in SNOMED. CIN 2 - Cervical intraepithelial
>>>> neoplasia 2 [A3002688/SNOMEDCT/SY/285838002] exists, but CIN II does not.
>>>> CIN III [A3333965/SNOMEDCT/SY/20365006] also exists, which is why it was
>>>> able to perform the lookup successfully.
>>>> Note that CIN II synonyms do exist in other UMLS thesauruses such as
>>>> MEDCIN and CCPSS, though.  However, the bundled cTAKES dictionaries only
>>>> contain MeSH, SNOMEDCT, RxNORM, NCI, and ICD9, IIRC.
>>>>
>>>> --Pei
>>>>
>>>> On Tue, Sep 3, 2013 at 9:44 PM, Miller, Timothy
>>>> <Timothy.Miller@childrens.harvard.edu> wrote:
>>>>> That is a good question, Ted!
>>>>>
>>>>> I tried it with a simple context: "The patient has a CIN III." I'm
>>>>> not sure if that is a correct context but I was able to duplicate
>>>>> your findings. (Finds a CUI for CIN III but not if you change it to
>>>>> CIN II)
>>>>>
>>>>> My first thought was that it is the chunker. But the chunker seems
>>>>> to get it right, as CIN II and CIN III are both called NPs, and
>>>>> similarly the LookupWindowAnnotator handles them both identically.
>>>>> So that suggests it is a problem with the actual lookup of the
>>>>> tokens in the LookupWindow.
>>>>>
>>>>> That's all I can do for now but maybe someone else who knows more
>>>>> about its behavior offhand will have an idea.
>>>>>
>>>>> Tim
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 09/03/2013 08:24 PM, Assur, Ted wrote:
>>>>>> I'm trying to understand what would prevent the
>>>>>> AggregatePlaintextUMLSProcessor AE from correctly parsing specific problems
>>>>>> that are defined in the UMLS version used by cTAKES.
>>>>>> For example,
>>>>>> CIN (Cervical Intraepithelial Neoplasia) in its general usage is
>>>>>> parsed out as UMLS CUI C0206708.
>>>>>> CIN comes in 3 grades, 1, 2 and 3. Sometimes this is reported with
>>>>>> Roman numerals, I, II, and III.
>>>>>> cTAKES correctly identifies "CIN 3" and "CIN III" with UMLS CUI
>>>>>> C0851140: "Carcinoma in situ of uterine cervix."
>>>>>> However, I cannot get it to recognize CIN 1, CIN I, CIN 2, or CIN II
>>>>>> as their correct concepts, "Cervical intraepithelial neoplasia grade 1" and
>>>>>> "Cervical intraepithelial neoplasia grade 2" respectively.
>>>>>> Is there a way to tune the detection of UMLS concepts?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --------------------------------------------
>>>>>> Ted Assur
>>>>>> IT Solutions Architect for Cancer Research Providence Health &
>>>>>> Services ted.assur@providence.org
>>>>>> 503-215-6476
>>>>>>
>>>>>> Crede, ut intelligas.
>>>>>> Intellego, ut credam.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>   ________________________________
>>>>>>
>>>>>> This message is intended for the sole use of the addressee, and may
>>>>>> contain information that is privileged, confidential and exempt from
>>>>>> disclosure under applicable law. If you are not the addressee you are
>>>>>> hereby notified that you may not use, copy, disclose, or distribute to
>>>>>> anyone the message or any information contained in the message. If you have
>>>>>> received this message in error, please immediately advise the sender by
>>>>>> reply email and delete this message.
>>
>>
> IMPORTANT WARNING: Information contained in this email is intended for the use of the
individual to whom it is addressed, and may contain information that is privileged, confidential,
and exempt from disclosure under applicable law. If you are not the intended recipient, or
the employee or agent responsible for delivering the message to the intended recipient, you
are hereby notified that any dissemination, distribution, or copying of this communication
is STRICTLY FORBIDDEN. If you have received this communication in error, please notify us
immediately by return email and delete this document. Thank you.
>

