ctakes-dev mailing list archives

From vijay garla <vnga...@gmail.com>
Subject Re: cTakes polarity problem
Date Fri, 02 Jan 2015 10:26:29 GMT
As Guergana mentioned, cTAKES has a rule-based negation detection module.
In addition, YTEX adds a NegEx-based analysis engine. Both approaches are
very sensitive to sentence splitting (see previous threads on alternative
sentence splitters).

An additional advantage of rule-based negation is that you don't need some
of the memory- and CPU-intensive analysis engines required by the ML-based
negation detection AE.

Hth

Vj

On Thursday, January 1, 2015, John Green <john.travis.green@gmail.com>
wrote:

> As I was reading this thread I had the same thought as Tim: perhaps a
> combination. With a perfect training corpus this wouldn't be necessary,
> but an "ensemble" approach could serve as a stopgap for people using your
> training data while working in a different corpus (not that I really have
> the time to write anything here, just spitballing because it's an
> interesting thread). I'm still bootstrapping myself in ML so I may not have
> followed David's reasoning perfectly, but couldn't a simple approach be
> that anything that isn't negated by the new algorithm gets passed to NegEx
> as a fallback? I think that was what you were saying, Tim.
>
> One area where I can comment in a more meaningful way is Tim's remarks
> regarding the legitimacy of the phrase "Deny hepatitis": I agree, my
> clinical intuition says it's an unlikely phrase. More probably it was a
> typo; "Negative for hepatitis" would be more reasonable after, say,
> serology for HepB markers, though strictly speaking this would be less
> likely to be in a phrase reporting results of just that specific test
> (this would more likely be something along the lines of "hep panel
> negative" or simply "the labs were unremarkable"). However, I could see
> this phrase in something like "the std screen was negative for hep but
> positive for hiv".
>
> The latter is definitely just one clinical opinion; people talk all kinds
> of ways on the wards, good and bad, and it ends up in their notes too.
>
> Best,
> JG
>
> On Wed, Dec 31, 2014 at 12:32 PM, David Kincaid <kincaid.dave@gmail.com>
> wrote:
>
> > Tim, I like your idea of a hybrid approach. I've thought about trying a
> > hybrid approach in the past myself, but haven't had a chance to try it or
> > seen any papers on it. It seems you could do it by either treating the
> > NegEx output simply as a feature in the ML model or combining the output
> > of NegEx and the ML model as an ensemble of sorts. The former would
> > probably have the problem of the NegEx "feature" overwhelming any other
> > features since it would be right most of the time. If I were doing it I
> > think I'd start with the latter approach.
> >
> > In any event, it seems like right now people will need to see how the two
> > systems (NegEx and ML) work on their particular data and go with
> > whichever is best.
> >
> > - Dave
> >
> > On Wed, Dec 31, 2014 at 10:40 AM, Miller, Timothy <
> > Timothy.Miller@childrens.harvard.edu> wrote:
> >
> > > Hi Michael,
> > > I'm somewhat sympathetic to that opinion. But we did a bunch of
> > > experiments and it seemed to us that NegEx was too hand-tailored for a
> > > specific dataset and that our new module did better across datasets and
> > > overall. The tradeoff is that it is harder to improve and it sometimes
> > > gives unexpected results on the kind of inputs people type in by hand
> > > for preliminary testing. That is a tradeoff people will have to
> > > consider, and like Guergana said, the rule-based module is still part of
> > > cTAKES. (FWIW, I believe it is possible to engineer examples that make
> > > NegEx fail in unintuitive ways as well.) If you are interested in these
> > > experiments please check out our paper in PLOS ONE where we look at the
> > > difficulty of the polarity problem, specifically porting systems to new
> > > domains:
> > >
> > > http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0112774
> > >
> > > I've been wondering if some hybrid approach might be useful. For
> > > example, maybe a system that runs the ML module and Negex and adds in
> > > all the recalled negated terms that Negex finds over and above the ML.
> > > This would probably fix some of the issues with test sentences but does
> > > not solve the problem of being hard to debug. Another possibility is
> > > using a more transparent ML method like decision trees or something.
> > >
> > > Tim
> > >
> > >
> > >
> > >
> > >
> > > On 12/31/2014 11:22 AM, Michael J Gurley wrote:
> > > > > I think this demonstrates that machine learning is not the right
> > > > > approach to the negation/polarity problem.
> > > >
> > > >
> > > > Michael Gurley
> > > > > m-gurley@northwestern.edu
> > > > 312 925 3268
> > > > Northwestern University Clinical and Translational Sciences Institute
> > > > (NUCATS)
> > > > http://www.nucats.northwestern.edu
> > > > Rubloff Building
> > > > 750 N Lake Shore Drive, 11th Floor
> > > > Chicago, IL 60611
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > On 12/31/14 9:13 AM, "Miller, Timothy"
> > > > > <Timothy.Miller@childrens.harvard.edu> wrote:
> > > >
> > > >> Hi Yu,
> > > >>
> > > > >> The new polarity module is machine-learning based so it is not always
> > > > >> easy to diagnose accuracy issues. But generally it might mean there was
> > > > >> no example like that in the training data. It was trained on multiple
> > > > >> corpora, but sometimes certain phrases slip through the cracks, and
> > > > >> "Deny hepatitis," while possible in the truncated language of clinical
> > > > >> notes, seems like an unlikely phrase and so it may not be in our data.
> > > > >> Is that a real example you saw or just a minimum (not) working example?
> > > > >> If not do you have a real example (i.e. a whole sentence) where "deny"
> > > > >> should cause a negation but does not? If so I will look into it. We have
> > > > >> had a few reports like this so it may be worth keeping track of missed
> > > > >> examples for future iterations of the module. It is important that they
> > > > >> be real examples "from the wild" though.
> > > >>
> > > > >> (As an aside, machine learning methods don't understand language the way
> > > > >> people do so even if it seems obvious to a human that "Deny <disease>."
> > > > >> should be negated, if it looks different enough from the context of an
> > > > >> example from the training data the ML will sometimes fall back to the
> > > > >> majority class of "Not negated".)
> > > >>
> > > >> Tim
> > > >>
> > > >>
> > > >> On 12/31/2014 10:03 AM, Yu Liang wrote:
> > > > >>> I have a quick question about cTAKES.
> > > > >>> I am using the AE "AggregatePlaintextUMLSProcessor.xml" and want to get
> > > > >>> some negation results by referring to the polarity attribute.
> > > > >>> However, it turns out that, for example, "Negative for hepatitis" is not
> > > > >>> negated. I think that is weird, so I tried "No hepatitis" and "Denies
> > > > >>> hepatitis", which return "polarity=-1", but "Deny hepatitis."
> > > > >>> returns "polarity=1".
> > > > >>>
> > > > >>> Could anyone give me a clue as to what is wrong? Thank you!
> > > >
> > >
> > >
> >
>

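The fallback idea raised in this thread (run the ML polarity module first, then hand anything it leaves un-negated to a NegEx-style rule pass) can be sketched roughly as below. This is only an illustration under stated assumptions, not cTAKES, YTEX, or NegEx code: the trigger list is a tiny hypothetical stand-in for a real NegEx lexicon, `stub_ml_polarity` is a made-up placeholder that merely imitates the behaviour reported above, and all function names are invented for the example.

```python
import re
from typing import Callable

# Hypothetical, deliberately tiny trigger list. A real NegEx lexicon is much
# larger and also handles pseudo-negations and scope termination.
NEGATION_TRIGGERS = re.compile(
    r"\b(no|deny|denies|denied|negative for|without)\b", re.IGNORECASE
)


def rule_based_polarity(sentence: str, mention: str) -> int:
    """Return -1 if a trigger occurs before the mention in the sentence, else 1."""
    idx = sentence.lower().find(mention.lower())
    if idx < 0:
        return 1  # mention not found: leave it as not negated
    return -1 if NEGATION_TRIGGERS.search(sentence[:idx]) else 1


def stub_ml_polarity(sentence: str, mention: str) -> int:
    """Placeholder for a trained classifier (assumption, not the real module).

    It only recognises sentences starting with "No" or "Denies", imitating
    the misses on "Deny ..." and "Negative for ..." reported in the thread.
    """
    return -1 if re.match(r"\s*(no|denies)\b", sentence, re.IGNORECASE) else 1


def hybrid_polarity(
    sentence: str, mention: str, ml_polarity: Callable[[str, str], int]
) -> int:
    """Trust an ML negation; otherwise fall back to the rule-based pass."""
    if ml_polarity(sentence, mention) == -1:
        return -1
    return rule_based_polarity(sentence, mention)


if __name__ == "__main__":
    for text in ("No hepatitis", "Deny hepatitis.", "History of hepatitis"):
        print(text, "->", hybrid_polarity(text, "hepatitis", stub_ml_polarity))
```

Because the fallback only ever adds negations, it can raise recall but cannot correct a false negation from either component. The naive "any trigger before the mention" scope also over-negates: in "the std screen was negative for hep but positive for hiv" this sketch would mark "hiv" as negated, which is the kind of scope and sentence-boundary sensitivity mentioned at the top of this message.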