ctakes-dev mailing list archives

From "John Green" <john.travis.gr...@gmail.com>
Subject RE: Procedure
Date Thu, 17 Jul 2014 18:44:45 GMT
Wonderful explanation James, thank you!


JG
—
Sent from Mailbox for iPhone

On Thu, Jul 17, 2014 at 2:41 PM, Masanz, James J. <Masanz.James@mayo.edu>
wrote:

> The order you mentioned in your previous email had been "pulsatile abdominal
> mass" for both what is in UMLS and what was in the text being annotated,
> which is why I was asking about the ordering.
> Given that I now know the text you were annotating had a different word
> order than what is in UMLS, and seeing exactly what those orderings were,
> that explains why it was not being picked up.
> A quirk/feature of the current cTAKES dictionary lookup (as opposed to the
> newer one called lookup-2) is that the first word must be first, but in a
> multi-word (>2) entry, the order of the other words doesn't matter.
> So for example, with "abdominal pulsatile mass" in the dictionary, both of
> these should get annotated with the same CUI:
> abdominal pulsatile mass
> abdominal mass pulsatile
> but this will not get an annotation for that CUI
> pulsatile abdominal mass
> unless that ordering is also in the dictionary.
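
The ordering rule described above can be sketched as follows. This is only an illustration of the stated behavior, not cTAKES code: the function name `matches_entry` is hypothetical, and the sketch ignores normalization, intervening words, and the LookupWindow limit.

```python
def matches_entry(text_tokens, entry_tokens):
    """Hypothetical check mirroring the rule described in the email:
    the first word must match in first position, while the remaining
    words may appear in any order."""
    if not text_tokens or len(text_tokens) != len(entry_tokens):
        return False
    if text_tokens[0] != entry_tokens[0]:
        return False
    # Compare the remaining words as an unordered collection.
    return sorted(text_tokens[1:]) == sorted(entry_tokens[1:])


entry = "abdominal pulsatile mass".split()
for phrase in ("abdominal pulsatile mass",
               "abdominal mass pulsatile",
               "pulsatile abdominal mass"):
    print(phrase, "->", matches_entry(phrase.split(), entry))
```

Under this rule the first two phrases match the entry and the third does not, which is consistent with "pulsatile abdominal mass" being missed when only "abdominal pulsatile mass" is in the dictionary.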
> As for heart rate and temperature, whether they are annotated as procedures
> depends on whether they show up in the UMLS with the semantic types used by
> cTAKES.
> To check those, I would do this
> - Open the UMLS terminology services Metathesaurus Browser app 
>       https://uts.nlm.nih.gov/home.html
>       Applications->UTS Metathesaurus Browser
> - input the text of interest into the box in the left pane, and click Go
> - select the CUI that looks hopeful
> - the pane on the right will fill in with details about that Concept,
>   including the semantic types and all the Atoms.
> - look at the Semantic Types in the pane on the right
> - if not a semantic type that cTAKES annotates, select a different CUI
> - Once I have found a CUI with a semantic type cTAKES annotates, if the text
>   of the UMLS Concept itself is not exactly what I was looking for, I look at
>   all the Atoms and see if the text I was looking for appears with SNOMED_CT,
>   NCI, MSH, or ICD9CM.
> Note that cTAKES also uses normalized forms of the words in the text being
> processed, so if the input text were "lymph nodes" it would match a
> hypothetical dictionary entry of "lymph node".
> Also note that intervening words can be OK, up to a limit, but all words
> within the term must appear within a single LookupWindow.
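
As a rough illustration of the normalization note above: the sketch below uses a toy normalizer (lowercasing and stripping a trailing plural "s") as a stand-in. It is an assumption made for illustration only; the names `toy_normalize` and `normalized_tokens` are hypothetical, and the real normalization in cTAKES is considerably more thorough.

```python
def toy_normalize(word):
    """Toy stand-in for a lexical normalizer (an assumption, not the
    cTAKES implementation): lowercase and strip a simple plural 's'."""
    w = word.lower()
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]
    return w


def normalized_tokens(text):
    """Split on whitespace and normalize each token."""
    return [toy_normalize(t) for t in text.split()]


# The input text and the dictionary entry compare equal after normalization.
print(normalized_tokens("lymph nodes") == normalized_tokens("lymph node"))
```

The point is only that the comparison happens on normalized forms, so surface differences such as plurals need not block a match.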
> Hope that is helpful
> -- James
> -----Original Message-----
> From: John Green [mailto:john.travis.green@gmail.com] 
> Sent: Thursday, July 17, 2014 12:57 PM
> To: dev@ctakes.apache.org
> Cc: dev@ctakes.apache.org
> Subject: RE: Procedure
> I didn't see how it appeared in the dictionary; I just looked at the CUI in
> UMLS, which has it as abdominal pulsatile mass, which isn't the same order
> as the text I annotated in cTAKES (pulsatile abdominal mass). But if I'm
> wrong, great; it raises the question even more of why, if it was in the
> lookup window and in the dictionary, it was only annotated as abdominal
> mass.
> Apropos temperature and heart rate, the results of these are measurements,
> right? But it seems that they should also be procedures, in the sense that
> you perform a physical manipulation on a patient. If I were checking notes
> for whether or not someone checked vitals, as opposed to obtaining the
> measurements, this seems within the current use case, but I'm so often
> wrong here, being so new...
> JG
> —
> Sent from Mailbox for iPhone
> On Thu, Jul 17, 2014 at 1:44 PM, Masanz, James J. <Masanz.James@mayo.edu>
> wrote:
>> In general cTAKES doesn't pick up things with values, such as weight,
>> height, lab values, temperature, with the exception that the drug NER
>> pipeline can pick up medication-related values such as dose, strength, etc.
>> cTAKES does pick up a few things as MeasurementAnnotation just by pattern,
>> but doesn't associate those with a named entity that has a CUI.
>> The example of "pulsatile abdominal mass" listed the same 3 words in the
>> same order for the dictionary entry and the text that was processed, so
>> I'm not clear what you meant about word order.
>> -----Original Message-----
>> From: John Green [mailto:john.travis.green@gmail.com] 
>> Sent: Thursday, July 17, 2014 8:04 AM
>> To: dev@ctakes.apache.org
>> Subject: Re: Procedure
>> General, so that I don't keep generating work for others :-)
>> Specifically: Temperature wasn't annotated, and neither was Heart rate,
>> for example.
>> Different but related: it picked up "abdominal mass" (C0000734) but not
>> "pulsatile abdominal mass" (C0266835) when given "pulsatile abdominal
>> mass". I understand that this may be expected given the word order. If it
>> wasn't, then the concern, of course, is that by clinical intuition abdominal
>> mass isn't very specific and one wouldn't jump to thinking AAA. However,
>> with pulsatile abdominal mass you would immediately think AAA. This delta
>> is fairly well reflected in ytex's semantic similarity measure
>> (particularly LCH), with the distance being 0.84 and 0.64 for abdominal mass
>> to pulsatile abdominal mass and Abdominal Aortic Aneurysm (C0162871),
>> respectively.
>> Pulsatile abdominal mass was in the lookup window.
>> JG
>> On Wed, Jul 16, 2014 at 3:07 PM, Masanz, James J. <Masanz.James@mayo.edu>
>> wrote:
>>>
>>> It depends on the type of annotation.
>>>
>>> Some are rule-based. Some are machine-learning based (models).  Some are
>>> dictionary dependent.  And some are based on annotations earlier in the
>>> pipeline, and so looking at the part of speech tags within the tokens, for
>>> example, can explain which chunk something appears in, which can explain
>>> why something might not have been annotated as a DiseaseDisorderMention,
>>> for example.
>>>
>>> Are you asking a general question, or is there a specific type of
>>> annotation you are most interested in?
>>>
>>> -----Original Message-----
>>> From: John Green [mailto:john.travis.green@gmail.com]
>>> Sent: Wednesday, July 16, 2014 2:01 PM
>>> To: dev@ctakes.apache.org
>>> Subject: Procedure
>>>
>>> Is there a generally accepted procedure for identifying why an annotation
>>> wasn't made?
>>>
>>> JG
>>>