ctakes-dev mailing list archives

From David Kincaid <kincaid.d...@gmail.com>
Subject Re: cTakes polarity problem
Date Wed, 31 Dec 2014 17:32:53 GMT
Tim, I like your idea of a hybrid approach. I've thought about trying one
myself, but haven't had a chance to, and I haven't seen any papers on it.
It seems you could do it either by treating the NegEx output simply as a
feature in the ML model or by combining the outputs of NegEx and the ML
model as an ensemble of sorts. The former would probably have the problem
of the NegEx "feature" overwhelming the other features, since it would be
right most of the time. If I were doing it, I think I'd start with the
latter approach.
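
A minimal sketch of the latter (ensemble) approach, assuming each system's
output has already been reduced to a cTAKES-style polarity value per mention
(-1 negated, 1 not negated). The function name combine_polarity and the
mention ids are hypothetical and not part of cTAKES; the rule is the one Tim
suggested, adding whatever NegEx negates over and above the ML module:

```python
NEGATED = -1
AFFIRMED = 1


def combine_polarity(ml_polarity, negex_polarity):
    """Union rule: a mention is negated if either system negates it.

    Both arguments map a (hypothetical) mention id to a polarity value.
    Mentions missing from the NegEx output are treated as not negated.
    """
    combined = {}
    for mention_id, ml_value in ml_polarity.items():
        negex_value = negex_polarity.get(mention_id, AFFIRMED)
        if ml_value == NEGATED or negex_value == NEGATED:
            combined[mention_id] = NEGATED
        else:
            combined[mention_id] = AFFIRMED
    return combined


# Toy values for illustration only, not real system output.
ml = {"m1": -1, "m2": 1, "m3": 1}
negex = {"m1": -1, "m2": -1, "m3": 1}
print(combine_polarity(ml, negex))  # {'m1': -1, 'm2': -1, 'm3': 1}
```

The union should catch negations that either system misses, but any NegEx
false positive is passed straight through, so precision can only go down.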

In any event, it seems like right now people will need to see how the two
systems (NegEx and ML) work on their particular data and go with whichever
is best.
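
One way to make that comparison concrete, assuming you have a small
hand-labeled sample of your own notes and each system's polarity output for
the same mentions. The function name negation_scores and the toy values are
hypothetical, purely for illustration:

```python
def negation_scores(gold, predicted):
    """Precision, recall and F1 for the negated class (polarity == -1).

    gold and predicted map a mention id to a polarity value (-1 or 1).
    A mention missing from predicted counts as not negated.
    """
    tp = sum(1 for m, g in gold.items() if g == -1 and predicted.get(m) == -1)
    fp = sum(1 for m, g in gold.items() if g != -1 and predicted.get(m) == -1)
    fn = sum(1 for m, g in gold.items() if g == -1 and predicted.get(m) != -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels and outputs, not real results from either system.
gold = {"m1": -1, "m2": -1, "m3": 1, "m4": 1}
negex_out = {"m1": -1, "m2": -1, "m3": -1, "m4": 1}
ml_out = {"m1": -1, "m2": 1, "m3": 1, "m4": 1}

for name, output in [("NegEx", negex_out), ("ML", ml_out)]:
    p, r, f = negation_scores(gold, output)
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Scoring only the negated class keeps the large "not negated" majority from
hiding the differences between the two systems.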

- Dave

On Wed, Dec 31, 2014 at 10:40 AM, Miller, Timothy <
Timothy.Miller@childrens.harvard.edu> wrote:

> Hi Michael,
> I'm somewhat sympathetic to that opinion. But we did a bunch of
> experiments and it seemed to us that NegEx was too hand-tailored for a
> specific dataset and that our new module did better across datasets and
> overall. The tradeoff is that it is harder to improve and it sometimes
> gives unexpected results on the kind of inputs people type by hand for
> preliminary testing. That is a tradeoff people will have to consider and,
> as Guergana said, the rule-based module is still part of cTAKES.
> (FWIW, I believe it is possible to engineer examples that make NegEx
> fail in unintuitive ways as well.) If you are interested in these
> experiments, please check out our paper in PLOS ONE, where we look at the
> difficulty of the polarity problem, specifically porting systems to new
> domains:
> http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0112774
> I've been wondering if some hybrid approach might be useful. For
> example, maybe a system that runs both the ML module and NegEx and adds
> in all the negated terms that NegEx finds over and above the ML module.
> This would probably fix some of the issues with test sentences but does
> not solve the problem of being hard to debug. Another possibility is
> using a more transparent ML method like decision trees or something.
> Tim
> On 12/31/2014 11:22 AM, Michael J Gurley wrote:
> > I think this demonstrates that machine learning is not the right approach
> > to the negation/polarity problem.
> >
> >
> > Michael Gurley
> > m-gurley@northwestern.edu
> > 312 925 3268
> > Northwestern University Clinical and Translational Sciences Institute
> > (NUCATS)
> > http://www.nucats.northwestern.edu
> > Rubloff Building
> > 750 N Lake Shore Drive, 11th Floor
> > Chicago, IL 60611
> >
> > On 12/31/14 9:13 AM, "Miller, Timothy"
> > <Timothy.Miller@childrens.harvard.edu> wrote:
> >
> >> Hi Yu,
> >>
> >> The new polarity module is machine-learning based so it is not always
> >> easy to diagnose accuracy issues. But generally it might mean there was
> >> no example like that in the training data. It was trained on multiple
> >> corpora, but sometimes certain phrases slip through the cracks, and
> >> "Deny hepatitis," while possible in the truncated language of clinical
> >> notes, seems like an unlikely phrase and so it may not be in our data.
> >> Is that a real example you saw or just a minimal (not) working example?
> >> If it is not real, do you have a real example (i.e. a whole sentence)
> >> where "deny" should cause a negation but does not? If so, I will look
> >> into it. We have had a few reports like this, so it may be worth keeping
> >> track of missed examples for future iterations of the module. It is
> >> important that they be real examples "from the wild" though.
> >>
> >> (As an aside, machine learning methods don't understand language the way
> >> people do so even if it seems obvious to a human that "Deny <disease>."
> >> should be negated, if it looks different enough from the context of an
> >> example from the training data the ML will sometimes fall back to the
> >> majority class of "Not negated".)
> >>
> >> Tim
> >>
> >>
> >> On 12/31/2014 10:03 AM, Yu Liang wrote:
> >>> I have a quick question about CTAKES.
> >>> I am using AE "AggregatePlaintextUMLSProcessor.xml" and want to get
> >>> some negation results by referring to polarity attribute.
> >>> However, it turns out, for example "Negative for hepatitis", is not
> >>> negated. I think it is weird and I tried "No hepatitis", "Denies
> >>> hepatitis" which return "polarity= -1", but "Deny hepatitis." returns
> >>> "polarity=1".
> >>>
> >>> Could anyone give me a clue as to what is wrong? Thank you!
> >
