ctakes-dev mailing list archives

From John Green <john.travis.gr...@gmail.com>
Subject Re: cTakes polarity problem
Date Wed, 31 Dec 2014 23:07:32 GMT
As I was reading this thread I had the same thought as Tim: perhaps a
combination. With a perfect training corpus this wouldn't be necessary,
but as a stopgap the "ensemble" approach might help people who use your
training data but work on a different corpus (not that I really have the
time to write anything here, just spitballing because it's an interesting
thread). I'm still bootstrapping myself in ML so I may not have followed
David's reasoning perfectly, but couldn't a simple approach be that
anything that isn't negated by the new algorithm gets passed to NegEx as a
fallback? I think that was what you were saying, Tim.
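
A minimal sketch of that fallback idea, purely for illustration. Both
polarity functions below are hypothetical toy stand-ins that I made up;
they are not the cTAKES ML module or the real NegEx, and the cue lists
are assumptions. The only point is the control flow: trust the ML result
when it says "negated", otherwise let the rule-based pass have a say.

```python
import re

# Toy stand-ins only: neither function is a cTAKES or NegEx API.
# Cue lists are illustrative assumptions, not taken from either system.
ML_SEEN_CUES = re.compile(r"\b(no|denies)\b", re.IGNORECASE)
RULE_CUES = re.compile(r"\b(no|denies|deny|negative for|without)\b", re.IGNORECASE)


def _cue_before_concept(pattern, sentence, concept):
    """Return True if a cue from `pattern` appears before `concept`."""
    idx = sentence.lower().find(concept.lower())
    if idx == -1:
        return False
    return pattern.search(sentence[:idx]) is not None


def toy_ml_polarity(sentence, concept):
    """Pretend ML model: -1 (negated) only for cues it 'saw' in training."""
    return -1 if _cue_before_concept(ML_SEEN_CUES, sentence, concept) else 1


def toy_rule_polarity(sentence, concept):
    """Pretend rule-based pass with a broader hand-written cue list."""
    return -1 if _cue_before_concept(RULE_CUES, sentence, concept) else 1


def hybrid_polarity(sentence, concept, ml=toy_ml_polarity, rules=toy_rule_polarity):
    """Run the ML pass first; fall back to the rules if it did not negate."""
    if ml(sentence, concept) == -1:
        return -1
    return rules(sentence, concept)


if __name__ == "__main__":
    for text in ("Denies hepatitis.", "Deny hepatitis.", "Patient has hepatitis."):
        print(text, hybrid_polarity(text, "hepatitis"))
```

Note this can only add negations on top of the ML output, so it would
raise recall at the risk of importing the rule-based false positives.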

One area where I can comment in a more meaningful way is Tim's remark
about the legitimacy of the phrase "Deny hepatitis": I agree, my clinical
intuition says it's an unlikely phrase. More probably it was a typo.
"Negative for hepatitis" would be more reasonable after, say, serology for
HepB markers, though strictly speaking it would be less likely to appear
in a phrase reporting results of just that specific test (that would more
likely be something along the lines of "hep panel negative" or simply
"the labs were unremarkable"). However, I could see this phrase in
something like "the std screen was negative for hep but positive for
hiv".

The latter is definitely just one clinical opinion; people talk all kinds
of ways on the wards, good and bad, and it ends up in their notes too.

Best,
JG

On Wed, Dec 31, 2014 at 12:32 PM, David Kincaid <kincaid.dave@gmail.com>
wrote:

> Tim, I like your idea of a hybrid approach. I've thought about trying a
> hybrid approach in the past myself, but haven't had a chance to try it or
> seen any papers on it. It seems you could do it by either treating the
> NegEx output simply as a feature in the ML model or combining the output of
> NegEx and the ML model as an ensemble of sorts. The former would probably
> have the problem of the NegEx "feature" overwhelming any other features
> since it would be right most of the time. If I were doing it I think I'd
> start with the latter approach.
>
> In any event, it seems like right now people will need to see how the two
> systems (NegEx and ML) work on their particular data and go with whichever
> is best.
>
> - Dave
>
> On Wed, Dec 31, 2014 at 10:40 AM, Miller, Timothy <
> Timothy.Miller@childrens.harvard.edu> wrote:
>
> > Hi Michael,
> > I'm somewhat sympathetic to that opinion. But we did a bunch of
> > experiments and it seemed to us that negex was too hand-tailored for a
> > specific dataset and that our new module did better across datasets and
> > overall. The tradeoff is that it is harder to improve and it sometimes
> > gives unexpected results on the kind of inputs people input by hand for
> > preliminary testing. That is a tradeoff people will have to consider and
> > like Guergana said, the rule-based module is still part of cTAKES.
> > (FWIW, I believe it is possible to engineer examples that make Negex
> > fail in unintuitive ways as well.) If you are interested in these
> > experiments please check out our paper in PLOS ONE where we look at the
> > difficulty of the polarity problem, specifically porting systems to new
> > domains:
> >
> > http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0112774
> >
> > I've been wondering if some hybrid approach might be useful. For
> > example, maybe a system that runs the ML module and Negex and adds in
> > all the recalled negated terms that Negex finds over and above the ML.
> > This would probably fix some of the issues with test sentences but does
> > not solve the problem of being hard to debug. Another possibility is
> > using a more transparent ML method like decision trees or something.
> >
> > Tim
> >
> >
> >
> >
> >
> > On 12/31/2014 11:22 AM, Michael J Gurley wrote:
> > > > I think this demonstrates that machine learning is not the right
> > > > approach to the negation/polarity problem.
> > >
> > >
> > > Michael Gurley
> > > m-gurley@northwestern.edu
> > > 312 925 3268
> > > Northwestern University Clinical and Translational Sciences Institute
> > > (NUCATS)
> > > http://www.nucats.northwestern.edu
> > > Rubloff Building
> > > 750 N Lake Shore Drive, 11th Floor
> > > Chicago, IL 60611
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On 12/31/14 9:13 AM, "Miller, Timothy"
> > > <Timothy.Miller@childrens.harvard.edu> wrote:
> > >
> > >> Hi Yu,
> > >>
> > > >> The new polarity module is machine-learning based so it is not always
> > > >> easy to diagnose accuracy issues. But generally it might mean there
> > > >> was no example like that in the training data. It was trained on
> > > >> multiple corpora, but sometimes certain phrases slip through the
> > > >> cracks, and "Deny hepatitis," while possible in the truncated language
> > > >> of clinical notes, seems like an unlikely phrase and so it may not be
> > > >> in our data. Is that a real example you saw or just a minimum (not)
> > > >> working example? If not do you have a real example (i.e. a whole
> > > >> sentence) where "deny" should cause a negation but does not? If so I
> > > >> will look into it. We have had a few reports like this so it may be
> > > >> worth keeping track of missed examples for future iterations of the
> > > >> module. It is important that they be real examples "from the wild"
> > > >> though.
> > > >>
> > > >> (As an aside, machine learning methods don't understand language the
> > > >> way people do so even if it seems obvious to a human that "Deny
> > > >> <disease>." should be negated, if it looks different enough from the
> > > >> context of an example from the training data the ML will sometimes
> > > >> fall back to the majority class of "Not negated".)
> > >>
> > >> Tim
> > >>
> > >>
> > >> On 12/31/2014 10:03 AM, Yu Liang wrote:
> > >>> I have a quick question about CTAKES.
> > > >>> I am using AE "AggregatePlaintextUMLSProcessor.xml" and want to get
> > > >>> some negation results by referring to polarity attribute.
> > > >>> However, it turns out, for example "Negative for hepatitis", is not
> > > >>> negated. I think it is weird and I tried "No hepatitis", "Denies
> > > >>> hepatitis" which return "polarity= -1", but "Deny hepatitis."
> > > >>> returns "polarity=1".
> > >>>
> > >>> any one could give me some clue that what is wrong? Thank you!
> > >
> >
> >
>
