ctakes-dev mailing list archives

From Pei Chen <chen...@apache.org>
Subject Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor
Date Wed, 04 Sep 2013 17:02:25 GMT
Ted,

> On another note, I know the cTAKES dictionary uses ICD9, but I'm not
> familiar with how to access that information: In the example I've
> described below, where would I locate the ICD9 for a specific entity?

Even though ICD9 is included in the lookup, IIRC cTAKES is configured [1] by
default to return/store only concepts [2] that have a SNOMEDCT code or an
RxNorm code.

[1]
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-res/src/main/resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml

[2]
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/ae/UmlsToSnomedConsumerImpl.java

If you would like it to return ICD9 codes, you would need to
modify/configure the above...
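
To illustrate the kind of filtering described above, here is a minimal Java
sketch, assuming each dictionary hit carries the name of its source
vocabulary. It is not the actual cTAKES code: the class
CodingSchemeFilterSketch, the Hit type, the keepAllowed method, and the
vocabulary labels "SNOMEDCT", "RXNORM", and "ICD9CM" are hypothetical names
chosen for illustration, and the CUI and code values are placeholders. The
real behavior lives in the descriptor [1] and consumer [2] linked above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; none of these types are part of cTAKES.
final class CodingSchemeFilterSketch {

    // One dictionary hit: a concept ID plus the vocabulary and code it came from.
    static final class Hit {
        final String cui;
        final String vocabulary;
        final String code;

        Hit(String cui, String vocabulary, String code) {
            this.cui = cui;
            this.vocabulary = vocabulary;
            this.code = code;
        }
    }

    // Keeps only the hits whose source vocabulary is in the allowed set.
    static List<Hit> keepAllowed(List<Hit> hits, Set<String> allowedVocabularies) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hits) {
            if (allowedVocabularies.contains(hit.vocabulary)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Placeholder values, not real UMLS data.
        List<Hit> hits = Arrays.asList(
                new Hit("CUI-PLACEHOLDER", "SNOMEDCT", "snomed-code-placeholder"),
                new Hit("CUI-PLACEHOLDER", "ICD9CM", "icd9-code-placeholder"));

        // Assumed default: only SNOMEDCT and RXNORM codes are kept.
        Set<String> defaults = new HashSet<>(Arrays.asList("SNOMEDCT", "RXNORM"));
        System.out.println(keepAllowed(hits, defaults).size());

        // Adding ICD9CM to the allowed set lets the ICD9 hit through as well.
        Set<String> withIcd9 = new HashSet<>(defaults);
        withIcd9.add("ICD9CM");
        System.out.println(keepAllowed(hits, withIcd9).size());
    }
}
```

The exact parameter and class names in cTAKES may differ from this sketch;
[1] and [2] are the places to check.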

--Pei


On Wed, Sep 4, 2013 at 11:55 AM, Assur, Ted
<Theodore.Assur@providence.org> wrote:

> Thanks for looking into this, it's been puzzling me.
>
> On another note, I know the cTAKES dictionary uses ICD9, but I'm not
> familiar with how to access that information: In the example I've described
> below, where would I locate the ICD9 for a specific entity?
>
> Thank you
>
> Ted
>
> -----Original Message-----
> From: Pei Chen [mailto:chenpei@apache.org]
> Sent: Tuesday, September 03, 2013 7:13 PM
> To: dev@ctakes.apache.org
> Subject: Re: specificity in selecting EntityMentions when using
> AggregatePlaintextUMLSProcessor
>
> You're right, it should have gotten "CIN I"- that's a strange one,
> probably needs to be debugged/looked into further...
>
> On Tue, Sep 3, 2013 at 10:05 PM, Miller, Timothy <
> Timothy.Miller@childrens.harvard.edu> wrote:
> > Ah. So it will get
> > CIN 2 (in SNOMED)
> > CIN III (in SNOMED)
> > CIN 3 (in SNOMED)
> >
> > but the rest are not in SNOMED?
> >
> > I wonder why it doesn't get CIN I? It looks like that exists in SNOMED
> > (though I don't fully understand what all the symbols mean in the umls
> > browser).
> >
> >> CIN I - Cervical intraepithelial neoplasia 1
> >> [A3002690/SNOMEDCT/SY/285836003]
> >
> >
> > On 09/03/2013 09:55 PM, Pei Chen wrote:
> >> It has the correct parse (POS, chunks, and lookup window), but some of
> >> the terms do not exist in SNOMED. CIN 2 - Cervical intraepithelial
> >> neoplasia 2 [A3002688/SNOMEDCT/SY/285838002] exists, but not CIN II.
> >> CIN III [A3333965/SNOMEDCT/SY/20365006] also exists; that's why it was
> >> able to perform the lookup successfully.
> >> Note that CIN II synonyms do exist in other UMLS thesauri such as
> >> MEDCIN and CCPSS, though. However, the bundled cTAKES dictionaries only
> >> contain MeSH, SNOMEDCT, RxNORM, NCI, and ICD9, IIRC.
> >>
> >> --Pei
> >>
> >> On Tue, Sep 3, 2013 at 9:44 PM, Miller, Timothy
> >> <Timothy.Miller@childrens.harvard.edu> wrote:
> >>> That is a good question, Ted!
> >>>
> >>> I tried it with a simple context: "The patient has a CIN III." I'm
> >>> not sure if that is a correct context but I was able to duplicate
> >>> your findings. (Finds a CUI for CIN III but not if you change it to
> >>> CIN II)
> >>>
> >>> My first thought was that it is the chunker. But the chunker seems
> >>> to get it right, as CIN II and CIN III are both called NPs, and
> >>> similarly the LookupWindowAnnotator handles them both identically.
> >>> So that suggests it is a problem with the actual lookup of the
> >>> tokens in the LookupWindow.
> >>>
> >>> That's all I can do for now but maybe someone else who knows more
> >>> about its behavior offhand will have an idea.
> >>>
> >>> Tim
> >>>
> >>>
> >>>
> >>>
> >>> On 09/03/2013 08:24 PM, Assur, Ted wrote:
> >>>> I'm trying to understand what would prevent the
> >>>> AggregatePlaintextUMLSProcessor AE from correctly parsing specific
> >>>> problems that are defined in the UMLS version used by cTAKES.
> >>>>
> >>>> For example,
> >>>> CIN (Cervical Intraepithelial Neoplasia) in its general usage is
> >>>> parsed out as UMLS CUI C0206708.
> >>>>
> >>>> CIN comes in 3 grades, 1, 2 and 3. Sometimes this is reported with
> >>>> Roman numerals, I, II, and III.
> >>>>
> >>>> cTAKES correctly identifies "CIN 3" and "CIN III" with UMLS CUI
> >>>> C0851140: "Carcinoma in situ of uterine cervix."
> >>>>
> >>>> However, I cannot get it to recognize CIN 1, CIN I, CIN 2, or CIN II
> >>>> as their correct concepts, "Cervical intraepithelial neoplasia grade 1"
> >>>> and "Cervical intraepithelial neoplasia grade 2" respectively.
> >>>>
> >>>> Is there a way to tune the detection of UMLS concepts?
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --------------------------------------------
> >>>> Ted Assur
> >>>> IT Solutions Architect for Cancer Research Providence Health &
> >>>> Services ted.assur@providence.org
> >>>> 503-215-6476
> >>>>
> >>>> Crede, ut intelligas.
> >>>> Intellego, ut credam.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>   ________________________________
> >>>>
> >>>> This message is intended for the sole use of the addressee, and may
> contain information that is privileged, confidential and exempt from
> disclosure under applicable law. If you are not the addressee you are
> hereby notified that you may not use, copy, disclose, or distribute to
> anyone the message or any information contained in the message. If you have
> received this message in error, please immediately advise the sender by
> reply email and delete this message.
> >>>>
> >
>
