ctakes-dev mailing list archives

From Tomasz Oliwa <ol...@uchicago.edu>
Subject RE: Allergy Annotator
Date Fri, 17 Jul 2015 19:02:55 GMT

I am interested in the design decision of the sentence detector. 

Why does it split a sentence of the form "WORD1: WORD2 WORD3." into two sentences  "WORD1:"
and "WORD2 WORD3."? Do other components of cTAKES require such a sentence splitting?

It would seem to me that it should remain one sentence. For example, the smoking status detector
has its own SentenceAdjuster that merges some of such sentences back into one, because of
this design.
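
To make the merge step concrete, here is a minimal sketch of gluing a "KEY:" fragment back onto the sentence that follows it. It is not the actual SentenceAdjuster code from the smoking status module; the function name `merge_key_value_sentences` and the use of plain (begin, end) character offsets instead of UIMA Sentence annotations are assumptions made for illustration.

```python
def merge_key_value_sentences(text, sentences):
    """Merge a sentence ending in ':' with the sentence that follows it.

    `sentences` is a list of (begin, end) character offsets into `text`,
    sorted by begin. Returns a new list of (begin, end) offsets.
    """
    merged = []
    pending_begin = None  # start of a "KEY:" fragment waiting for its value
    for begin, end in sentences:
        start = begin if pending_begin is None else pending_begin
        if text[begin:end].rstrip().endswith(":"):
            # Hold this fragment and glue it to the next sentence.
            pending_begin = start
            continue
        merged.append((start, end))
        pending_begin = None
    if pending_begin is not None:
        # A trailing "KEY:" with nothing after it stays as its own sentence.
        merged.append((pending_begin, sentences[-1][1]))
    return merged


note = "ALLERGIES: PENICILLIN, WHEAT. No acute distress."
split = [(0, 10), (11, 29), (30, 48)]  # what a "KEY:" / "VALUE" split might look like
print([note[b:e] for b, e in merge_key_value_sentences(note, split)])
```

With the hypothetical offsets above, the first two spans come back as one sentence and the third is left alone.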


From: Finan, Sean [Sean.Finan@childrens.harvard.edu]
Sent: Friday, July 10, 2015 3:20 PM
To: dev@ctakes.apache.org
Subject: RE: Allergy Annotator

Hi Tom,

It is exactly because the sentence detector splits "KEY:" from "VALUE" that I didn't suggest
using sentences.  Instead, I would just iterate over the whole CAS collection of medication
events and attempt to match allergy phrases ("allergic to medication") against the note text
spanning from event.begin-15 to event.end+15, or whatever window size you prefer.
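
The window idea can be sketched roughly as follows. This is plain Python over character offsets, not the cTAKES/UIMA API: the function name `allergy_flagged_mentions`, the cue regex, and the representation of medication events as (begin, end) tuples are all assumptions for illustration.

```python
import re

# Hypothetical cue pattern; a real rule set would likely be broader.
ALLERGY_CUE = re.compile(r"\ballerg(?:y|ies|ic)\b", re.IGNORECASE)


def allergy_flagged_mentions(text, mentions, window=15):
    """Return the (begin, end) mentions that have an allergy cue nearby.

    `mentions` holds character offsets of medication events in `text`.
    The search window is text[begin - window : end + window], clipped
    to the bounds of the note.
    """
    flagged = []
    for begin, end in mentions:
        lo = max(0, begin - window)
        hi = min(len(text), end + window)
        if ALLERGY_CUE.search(text[lo:hi]):
            flagged.append((begin, end))
    return flagged


note = "ALLERGIES: PENICILLIN. Current medications: aspirin 81 mg daily."
mentions = [(11, 21), (44, 51)]  # "PENICILLIN", "aspirin"
print(allergy_flagged_mentions(note, mentions))
```

A fixed window is a trade-off: it ignores sentence boundaries entirely, but a small value can miss later items in a long "ALLERGIES:" list, because the cue word falls outside the window or is cut in half by it.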


-----Original Message-----
From: Tom Devel [mailto:develxy@gmail.com]
Sent: Friday, July 10, 2015 4:12 PM
To: dev@ctakes.apache.org
Subject: Re: Allergy Annotator

Sean and Dima, these are great suggestions, thanks so far.

Sean, when looping over medication events as you say, I can see how it is possible to take
the textspan.Sentence of this MedicationMention, and then do a regex check for the phrase
structure as Dima said.
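
A sentence-scoped regex check of this kind might look like the sketch below. It works on plain strings rather than textspan.Sentence and MedicationMention objects, and the helper name `is_allergy_mention` and the two patterns are assumptions based on the phrase structures suggested in this thread.

```python
import re


def is_allergy_mention(sentence, mention):
    """Return True if `mention` occurs in `sentence` inside an allergy phrase."""
    med = re.escape(mention)
    patterns = [
        # "allergic to <medication>"
        rf"\ballergic\s+to\s+{med}\b",
        # "allergies: <item1>, <item2>, ..." with the mention anywhere in the list
        rf"\ballergies\s*:\s*(?:[^,.;]+,\s*)*{med}\b",
    ]
    return any(re.search(p, sentence, re.IGNORECASE) for p in patterns)


print(is_allergy_mention("The patient is allergic to penicillin.", "penicillin"))
print(is_allergy_mention("ALLERGIES:  PENICILLIN, WHEAT", "wheat"))
print(is_allergy_mention("PENICILLIN, WHEAT", "wheat"))
```

The last call shows the limitation: once "ALLERGIES:" has been split off into its own sentence, the remaining text no longer contains the cue, so the check fails.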

But instead of textspan.Sentence, you mention "see if any is included in a phrase". What cTAKES/UIMA
class is related to this?

Because if I used textspan.Sentence, it would work for "The patient is allergic to penicillin.",
but "ALLERGIES:  PENICILLIN, WHEAT" would be split into two sentences, so that the
MedicationMentions here would not be in the same sentence as the word "ALLERGIES".

Thanks again,

On Fri, Jul 10, 2015 at 2:12 PM, Finan, Sean < Sean.Finan@childrens.harvard.edu> wrote:

> Hi Dima, Tom,
> I was thinking the same as Dima's first solution.  Iterate through the
> medication events and see if any is included in a phrase as mentioned in
> Tom's original email.  Each phrase structure would have to be
> specified beforehand.  However, assigning appropriate CUIs would
> require having a lookup table for each medication allergy.  I think
> that would be the simplest solution.
> Sean
> -----Original Message-----
> From: Dligach, Dmitriy [mailto:Dmitriy.Dligach@childrens.harvard.edu]
> Sent: Friday, July 10, 2015 2:50 PM
> To: cTAKES Developer list
> Subject: Re: Allergy Annotator
> Hi Tom,
> If the patterns are pretty simple, you could just add a few rules on
> top of the cTAKES dictionary lookup output. Something of the kind
> "allergic to <medication>" or "allergies: <medication1>,
> <medication2>, <substance1>, ...".
> If these patterns are hard to express as rules, you should consider a
> machine learning based sequence labeling route (e.g. something similar
> to the cTAKES chunker).
> Dima
> --
> Dmitriy (Dima) Dligach, Ph.D.
> Boston Children's Hospital and Harvard Medical School
> (617) 651-0397
> On Jul 10, 2015, at 13:40, Tom Devel <develxy@gmail.com> wrote:
> Sean,
> It would be a wider net, such that if an allergy is mentioned in the
> clinical note, this is captured in the corresponding
> IdentifiedAnnotation (or alternatively, if the IdentifiedAnnotation
> class should not be changed with a new attribute, in a separate allergy annotation).
> This annotator would then have to of course run after the clinical
> pipeline has run and discovered all IdentifiedAnnotations.
> I am familiar with writing UIMA/cTAKES annotators, but not sure how a
> new ML method could be integrated here for detecting allergies. Do you
> have any thoughts about how to approach this in general?
> Thanks,
> Tom
> On Fri, Jul 10, 2015 at 11:54 AM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:
> Hi Tom,
> Are you interested in catching all allergies or just a few specific
> allergies for a study?  If you are only concerned with a few then
> there is a (possibly) simple solution.  If you are interested in
> throwing a wider net then I think that a new module would need to be
> created; does anybody reading this have an ML or regex style module?
> Sean
> -----Original Message-----
> From: Tom Devel [mailto:develxy@gmail.com]
> Sent: Friday, July 10, 2015 12:42 PM
> To: dev@ctakes.apache.org
> Subject: Allergy Annotator
> Hi,
> I would like to use/extend cTAKES to detect allergies.
> In the cTAKES publication (2010)
> http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2995668/
> there is the mention
> that: "Allergies to a given medication are handled by setting the
> negation attribute of that medication to 'is negated'."
> However, in a post here in 2014 (RE: Allergy Indication) it is said
> that cTAKES does not have a module for allergy discovery.
> 1. What is the current status of allergy detection in cTAKES?
> 2. I did some testing, while cTAKES discovers concepts about allergies
> ("wheat allergy" is found as C0949570), using "ALLERGIES:  PENICILLIN,
> WHEAT" or "The patient is allergic to penicillin." does not give
> penicillin or wheat annotations allergy status.
> How would I go about detecting these allergy mentions?
> Thanks,
> Tom
