ctakes-dev mailing list archives

From Pei Chen <chen...@apache.org>
Subject Re: Include the smoking status detection in AggregatePlaintextFastUMLSProcessor.xml
Date Sat, 18 Apr 2015 05:53:05 GMT
Tom,
I would put it at the end of the pipeline (at a minimum, it should come
after the sectionizer, sentence detector, tokenizer, and lvg).  I would
remove ExternalBaseAggregateTAE, since it simulates the sectionizer,
sentence detector, tokenizer, and lvg, which would be redundant.  I would
also probably remove the last NegEx, which could override the assertion
values.
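
As a rough, untested sketch of that ordering, the tail of the fixed flow in
AggregatePlaintextFastUMLSProcessor.xml might end up looking like the
fragment below.  It assumes that SentenceAdjuster and
ClassifiableEntriesAnnotator are also declared as delegates under
delegateAnalysisEngineSpecifiers in the aggregate (for example, by copying
the corresponding entries from SimulatedProdSmokingTAE.xml), since a fixed
flow can only refer to delegate keys that the aggregate declares.  The NegEx
change is not shown, because it is not part of either fixed flow quoted
below.

```xml
<fixedFlow>
  <!-- existing clinical pipeline nodes, unchanged and elided here -->
  <node>UncertaintyCleartkAnalysisEngine</node>
  <node>ExtractionPrepAnnotator</node>
  <!-- smoking status nodes appended last; ExternalBaseAggregateTAE is
       left out because the nodes above already do that work -->
  <node>SentenceAdjuster</node>
  <node>ClassifiableEntriesAnnotator</node>
</fixedFlow>
```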

Disclaimer: I have not tested this yet.  Feel free to open a Jira item if it
works for you so it can be tracked.  It seems kind of strange to have a
descriptor XML define another XML descriptor that is then loaded via code;
I think this could be simplified.
--Pei

On Thu, Apr 16, 2015 at 7:29 PM, Tom Devel <develxy@gmail.com> wrote:

> Hi,
>
> I am using the smoking status AE from SimulatedProdSmokingTAE.xml. It works
> fine, and I can see the smoking status annotation in the CVD.
>
> Now I would like to include the smoking status detection in the clinical
> pipeline of AggregatePlaintextFastUMLSProcessor.xml, so that when I run the
> clinical pipeline, the smoking status is also determined.
>
> How can I do this?
>
> I am thinking of just putting the nodes from the fixed flow of
> SimulatedProdSmokingTAE.xml into the fixed flow of
> AggregatePlaintextFastUMLSProcessor.xml. Is this the right approach?
>
> If so, at which exact place in the clinical pipeline fixed flow should
> these nodes be added?
>
> Is there a preferred place (such as appending after the last node or
> inserting before the first node)?
>
> Can a wrong position or ordering of the smoking status nodes damage/corrupt
> the rest of the annotations?
>
> SimulatedProdSmokingTAE.xml contains these lines with the fixed flow:
>
> <fixedFlow>
> <node>ExternalBaseAggregateTAE</node>
> <node>SentenceAdjuster</node>
> <node>ClassifiableEntriesAnnotator</node>
> </fixedFlow>
>
> AggregatePlaintextFastUMLSProcessor.xml (3.2.2 from SVN) contains this
> fixed flow:
>
> <fixedFlow>
> <node>SimpleSegmentAnnotator</node>
> <node>SentenceDetectorAnnotator</node>
> <node>TokenizerAnnotator</node>
> <node>LvgAnnotator</node>
> <node>ContextDependentTokenizerAnnotator</node>
> <node>POSTagger</node>
> <!-- <node>ClearPOSTagger</node>  -->
> <node>Chunker</node>
> <node>AdjustNounPhraseToIncludeFollowingNP</node>
> <node>AdjustNounPhraseToIncludeFollowingPPNP</node>
> <!--<node>LookupWindowAnnotator</node>-->
> <node>DictionaryLookupAnnotatorDB</node>
> <node>DrugNER</node>
> <node>DependencyParser</node>
> <node>SemanticRoleLabeler</node>
> <node>ConstituencyParser</node>
> <!-- <node>AssertionAnnotator</node> -->
> <!-- <node>StatusAnnotator</node> -->
> <!-- <node>NegationAnnotator</node> -->
> <node>GenericCleartkAnalysisEngine</node>
> <node>HistoryCleartkAnalysisEngine</node>
> <node>PolarityCleartkAnalysisEngine</node>
> <node>SubjectCleartkAnalysisEngine</node>
> <node>UncertaintyCleartkAnalysisEngine</node>
>
> <node>ExtractionPrepAnnotator</node>
> </fixedFlow>
>
> Thanks for any help or pointers,
>
> Tom
>
