ctakes-dev mailing list archives

From Pei Chen <chen...@apache.org>
Subject Re: Include the smoking status detection in AggregatePlaintextFastUMLSProcessor.xml
Date Tue, 21 Apr 2015 17:31:22 GMT
If it works for you, I would keep it in there then.  Leave the info in the
Jira, and we should double-check in the code that this negation step is only
used for the smoking status types.
--Pei

On Tue, Apr 21, 2015 at 1:04 PM, Tom Devel <develxy@gmail.com> wrote:

> After further testing: when I remove the <node>NegationAnnotator</node> step
> from ProductionPostSentenceAggregate_step2_libsvm.xml (which I assume is the
> sub smoking desc xml you mean), the smoking status is no longer classified
> correctly when negations are present, so this step does not look redundant to
> me.
>
>
> For example, "He denied use of tobacco" is then classified as
> CURRENT_SMOKER. If I leave this negation step in, it is correctly found as
> NON_SMOKER.
>
>
> I tried changing the order in which the smoking status nodes
> <node>SentenceAdjuster</node> and <node>ClassifiableEntriesAnnotator</node>
> are run in the clinical pipeline. Putting them directly after lvg or at the
> end of the flow does not change the observation above.
>
>
> However, you said that leaving the NegationAnnotator in could overwrite
> assertion values. How can this be prevented while keeping correct smoking
> status classifications?
>
> On Mon, Apr 20, 2015 at 2:02 PM, Chen, Pei <Pei.Chen@childrens.harvard.edu>
> wrote:
>
> > Great. There is a redundant Negation step in one of the final sub smoking
> > desc xml's.
> > Leave the Jira as a placeholder to clean up the smoking status desc's.
> >
> > Sent from my iPhone
> >
> > > On Apr 20, 2015, at 1:11 PM, Tom Devel <develxy@gmail.com> wrote:
> > >
> > > Pei,
> > >
> > > I did what you recommended: I ran a test input through this new pipeline
> > > and diffed the resulting CAS file against the one from the clinical
> > > pipeline without the smoking status. It seems to do the trick: the UMLS
> > > concept tags are still the same, and there is now a new tag for the
> > > smoking status annotation. Great!
> > >
> > > Before I create the Jira item, what do you mean by removing the last
> > > NegEx?
> > >
> > > In AggregatePlaintextFastUMLSProcessor, the node of the NegationAnnotator
> > > is commented out:
> > > <!-- <node>NegationAnnotator</node> -->
> > >
> > > Did you mean this node?
> > >
> > > At the top of the file, there is an import for the NegationAnnotator,
> > > <delegateAnalysisEngine key="NegationAnnotator">. It is not commented
> > > out, yet it is never run in the fixed flow.
> > >
> > > Am I correct that the negation detection in the clinical pipeline is now
> > > performed by PolarityCleartkAnalysisEngine?
> > >
> > > Thanks,
> > > Tom
> > >
> > >> On Sat, Apr 18, 2015 at 12:53 AM, Pei Chen <chenpei@apache.org> wrote:
> > >>
> > >> Tom,
> > >> I would put it at the end of the pipeline (at a minimum, it should come
> > >> after the sectionizer, sentence detector, tokenizer, and lvg).  I would
> > >> remove ExternalBaseAggregateTAE, as it simulates the sectionizer,
> > >> sentence detector, tokenizer, and lvg, which would be redundant.  I would
> > >> also probably remove the last NegEx, which could override the assertion
> > >> values.
> > >>
> > >> Disclaimer: I did not test this yet.  Feel free to open a Jira item if it
> > >> works for you so it can be tracked.  It seems kind of strange to have a
> > >> descriptor xml define another xml descriptor to be loaded up via code
> > >> again. I think this could be simplified.
> > >> --Pei
> > >>
> > >>> On Thu, Apr 16, 2015 at 7:29 PM, Tom Devel <develxy@gmail.com> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> I am using the smoking status AE from SimulatedProdSmokingTAE.xml. It
> > >>> works fine; I can see the smoking status annotation in the CVD.
> > >>>
> > >>> Now I would like to include the smoking status detection in the clinical
> > >>> pipeline of AggregatePlaintextFastUMLSProcessor.xml, so that when I run
> > >>> the clinical pipeline, the smoking status will also be determined.
> > >>>
> > >>> How can I do this?
> > >>>
> > >>> I am thinking of just putting the nodes from the fixed flow of
> > >>> SimulatedProdSmokingTAE.xml into the fixed flow of
> > >>> AggregatePlaintextFastUMLSProcessor.xml. Is this the right approach?
> > >>>
> > >>> If so, at which exact place in the clinical pipeline fixed flow should
> > >>> these nodes be added?
> > >>>
> > >>> Is there a preferred place (such as appended after the last node or put
> > >>> before the first node)?
> > >>>
> > >>> Can a wrong position or ordering of the smoking status nodes
> > >>> damage/corrupt the rest of the annotations?
> > >>>
> > >>> SimulatedProdSmokingTAE.xml contains these lines with the fixed flow:
> > >>>
> > >>> <fixedFlow>
> > >>> <node>ExternalBaseAggregateTAE</node>
> > >>> <node>SentenceAdjuster</node>
> > >>> <node>ClassifiableEntriesAnnotator</node>
> > >>> </fixedFlow>
> > >>>
> > >>> AggregatePlaintextFastUMLSProcessor.xml (3.2.2 from SVN) contains this
> > >>> fixed flow:
> > >>>
> > >>> <fixedFlow>
> > >>> <node>SimpleSegmentAnnotator</node>
> > >>> <node>SentenceDetectorAnnotator</node>
> > >>> <node>TokenizerAnnotator</node>
> > >>> <node>LvgAnnotator</node>
> > >>> <node>ContextDependentTokenizerAnnotator</node>
> > >>> <node>POSTagger</node>
> > >>> <!-- <node>ClearPOSTagger</node> -->
> > >>> <node>Chunker</node>
> > >>> <node>AdjustNounPhraseToIncludeFollowingNP</node>
> > >>> <node>AdjustNounPhraseToIncludeFollowingPPNP</node>
> > >>> <!--<node>LookupWindowAnnotator</node>-->
> > >>> <node>DictionaryLookupAnnotatorDB</node>
> > >>> <node>DrugNER</node>
> > >>> <node>DependencyParser</node>
> > >>> <node>SemanticRoleLabeler</node>
> > >>> <node>ConstituencyParser</node>
> > >>> <!-- <node>AssertionAnnotator</node> -->
> > >>> <!-- <node>StatusAnnotator</node> -->
> > >>> <!-- <node>NegationAnnotator</node> -->
> > >>> <node>GenericCleartkAnalysisEngine</node>
> > >>> <node>HistoryCleartkAnalysisEngine</node>
> > >>> <node>PolarityCleartkAnalysisEngine</node>
> > >>> <node>SubjectCleartkAnalysisEngine</node>
> > >>> <node>UncertaintyCleartkAnalysisEngine</node>
> > >>>
> > >>> <node>ExtractionPrepAnnotator</node>
> > >>> </fixedFlow>
> > >>>
> > >>> Thanks for any help or pointers,
> > >>>
> > >>> Tom
> > >>
> >
>
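
For reference, the combined fixed flow discussed in this thread might look
roughly like the sketch below. It is untested. It assumes the two smoking
status nodes are appended after the last existing node (Pei's "end of the
pipeline" suggestion; the exact position relative to ExtractionPrepAnnotator
is not stated in the thread), and that ExternalBaseAggregateTAE is left out
because the clinical pipeline already runs the sectionizer, sentence detector,
tokenizer, and lvg. Each new node would also need a matching
<delegateAnalysisEngine key="..."> entry at the top of the descriptor; those
are not shown because their import locations are not given here.

```xml
<fixedFlow>
  <node>SimpleSegmentAnnotator</node>
  <node>SentenceDetectorAnnotator</node>
  <node>TokenizerAnnotator</node>
  <node>LvgAnnotator</node>
  <node>ContextDependentTokenizerAnnotator</node>
  <node>POSTagger</node>
  <node>Chunker</node>
  <node>AdjustNounPhraseToIncludeFollowingNP</node>
  <node>AdjustNounPhraseToIncludeFollowingPPNP</node>
  <node>DictionaryLookupAnnotatorDB</node>
  <node>DrugNER</node>
  <node>DependencyParser</node>
  <node>SemanticRoleLabeler</node>
  <node>ConstituencyParser</node>
  <!-- Top-level NegationAnnotator stays commented out, as in the original;
       polarity is handled by PolarityCleartkAnalysisEngine. -->
  <!-- <node>NegationAnnotator</node> -->
  <node>GenericCleartkAnalysisEngine</node>
  <node>HistoryCleartkAnalysisEngine</node>
  <node>PolarityCleartkAnalysisEngine</node>
  <node>SubjectCleartkAnalysisEngine</node>
  <node>UncertaintyCleartkAnalysisEngine</node>
  <node>ExtractionPrepAnnotator</node>
  <!-- Assumption: smoking status nodes appended at the end of the flow.
       ExternalBaseAggregateTAE from SimulatedProdSmokingTAE.xml is omitted. -->
  <node>SentenceAdjuster</node>
  <node>ClassifiableEntriesAnnotator</node>
</fixedFlow>
```

Note that this only covers the top-level flow. Per the test reported above,
the NegationAnnotator step inside
ProductionPostSentenceAggregate_step2_libsvm.xml is kept, since removing it
caused "He denied use of tobacco" to be classified as CURRENT_SMOKER.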
