ctakes-dev mailing list archives

From "Miller, Timothy" <Timothy.Mil...@childrens.harvard.edu>
Subject Re: cTakes polarity problem
Date Wed, 31 Dec 2014 16:40:08 GMT
Hi Michael,
I'm somewhat sympathetic to that opinion. But we ran a number of
experiments, and it seemed to us that NegEx was too hand-tailored to a
specific dataset, while our new module did better across datasets and
overall. The tradeoff is that the ML module is harder to improve, and it
sometimes gives unexpected results on the kind of inputs people type in
by hand for preliminary testing. That is a tradeoff people will have to
consider and, like Guergana said, the rule-based module is still part of
cTAKES. (FWIW, I believe it is possible to engineer examples that make
NegEx fail in unintuitive ways as well.) If you are interested in these
experiments, please check out our paper in PLOS ONE, where we look at
the difficulty of the polarity problem, specifically porting systems to
new domains:
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0112774

I've been wondering if some hybrid approach might be useful. For
example, maybe a system that runs both the ML module and NegEx, and then
adds in any negated terms that NegEx finds beyond those the ML module
already marks as negated. This would probably fix some of the issues
with test sentences, but it does not solve the problem of the ML module
being hard to debug. Another possibility is using a more transparent ML
method, like decision trees or something similar.
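
To make the hybrid idea concrete, here is a rough sketch in Python. It is
not cTAKES code: the Mention class, the tiny trigger list, and the
function names are all made up for illustration, and the rule check is
only a toy stand-in for NegEx. The only thing taken from this thread is
the polarity convention (-1 for negated, 1 for not negated).

```python
import re
from dataclasses import dataclass

# Hypothetical toy trigger list; the real NegEx lexicon is much larger.
NEGATION_TRIGGERS = ("no", "not", "denies", "deny", "negative for", "without")


@dataclass(frozen=True)
class Mention:
    """A concept mention with character offsets into its sentence."""
    text: str
    begin: int
    end: int


def rule_negated(sentence: str, mention: Mention, window: int = 5) -> bool:
    """Toy rule: a trigger occurs within `window` tokens before the mention."""
    preceding = sentence[:mention.begin].lower()
    tokens = re.findall(r"[a-z]+", preceding)[-window:]
    context = " ".join(tokens)
    return any(
        re.search(rf"\b{re.escape(trigger)}\b", context)
        for trigger in NEGATION_TRIGGERS
    )


def hybrid_polarity(sentence: str, mention: Mention, ml_polarity: int) -> int:
    """Return -1 if either the ML output or the rule says negated, else 1."""
    if ml_polarity == -1 or rule_negated(sentence, mention):
        return -1
    return 1


if __name__ == "__main__":
    sentence = "Deny hepatitis."
    mention = Mention("hepatitis", 5, 14)
    # Suppose the ML module reported polarity=1 for this mention.
    print(hybrid_polarity(sentence, mention, ml_polarity=1))
```

Because the combination is a plain union of negations, it can only add
negated mentions on top of the ML output, so it trades some precision for
recall. It also leaves the ML decisions themselves just as opaque as
before.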

Tim





On 12/31/2014 11:22 AM, Michael J Gurley wrote:
> I think this demonstrates that machine learning is not the right approach
> to the negation/polarity problem.
>
>
> Michael Gurley
> m-gurley@northwestern.edu
> 312 925 3268
> Northwestern University Clinical and Translational Sciences Institute
> (NUCATS)
> http://www.nucats.northwestern.edu
> Rubloff Building
> 750 N Lake Shore Drive, 11th Floor
> Chicago, IL 60611
>
>
>
>
>
>
>
> On 12/31/14 9:13 AM, "Miller, Timothy"
> <Timothy.Miller@childrens.harvard.edu> wrote:
>
>> Hi Yu,
>>
>> The new polarity module is machine-learning based so it is not always
>> easy to diagnose accuracy issues. But generally it might mean there was
>> no example like that in the training data. It was trained on multiple
>> corpora, but sometimes certain phrases slip through the cracks, and
>> "Deny hepatitis," while possible in the truncated language of clinical
>> notes, seems like an unlikely phrase and so it may not be in our data.
>> Is that a real example you saw or just a minimal (not) working example?
>> If it is not real, do you have a real example (i.e. a whole sentence)
>> where "deny" should cause a negation but does not? If so I will look
>> into it. We have
>> had a few reports like this so it may be worth keeping track of missed
>> examples for future iterations of the module. It is important that they
>> be real examples "from the wild" though.
>>
>> (As an aside, machine learning methods don't understand language the way
>> people do so even if it seems obvious to a human that "Deny <disease>."
>> should be negated, if it looks different enough from the context of an
>> example from the training data the ML will sometimes fall back to the
>> majority class of "Not negated".)
>>
>> Tim
>>
>>
>> On 12/31/2014 10:03 AM, Yu Liang wrote:
>>> I have a quick question about CTAKES.
>>> I am using AE "AggregatePlaintextUMLSProcessor.xml" and want to get
>>> some negation results by referring to polarity attribute.
>>> However, it turns out, for example "Negative for hepatitis", is not
>>> negated. I think it is weird and I tried "No hepatitis", "Denies
>>> hepatitis" which return "polarity= -1", but "Deny hepatitis." returns
>>> "polarity=1".
>>>
>>> Could anyone give me a clue as to what is wrong? Thank you!
>

