ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject Re: Include the smoking status detection in AggregatePlaintextFastUMLSProcessor.xml
Date Mon, 20 Apr 2015 19:02:29 GMT
Great. There is a redundant negation step in one of the final smoking status sub-descriptor XMLs.
Leave the Jira as a placeholder to clean up the smoking status descriptors.

Sent from my iPhone

> On Apr 20, 2015, at 1:11 PM, Tom Devel <develxy@gmail.com> wrote:
> 
> Pei,
> 
> I did what you recommended: I ran a test input through the new pipeline and
> diffed the resulting CAS file against the one from the clinical pipeline
> without smoking status. It seems to do the trick: the UMLS concept tags are
> still the same, and there is now a new tag for the smoking status annotation. Great!
> 
> Before I create the Jira item, what do you mean by removing the last
> NegEx?
> 
> In AggregatePlaintextFastUMLSProcessor, the node of the NegationAnnotator
> is commented out:
> <!-- <node>NegationAnnotator</node> -->
> 
> Did you mean this node?
> 
> At the top of the file, there is an import for the NegationAnnotator:
> <delegateAnalysisEngine key="NegationAnnotator">. It is not commented
> out, but it is never run in the fixed flow.
> 
> Am I correct that the negation detection in the clinical pipeline is now
> performed by PolarityCleartkAnalysisEngine?
> 
> Thanks,
> Tom
> 
>> On Sat, Apr 18, 2015 at 12:53 AM, Pei Chen <chenpei@apache.org> wrote:
>> 
>> Tom,
>> I would put it at the end of the pipeline (at a minimum, it should come after
>> the sectionizer, sentence detector, tokenizer, and lvg).  I would remove
>> ExternalBaseAggregateTAE,
>> as it simulates the sectionizer, sentence detector, tokenizer, and lvg, which
>> would be redundant.  I would also probably remove the last NegEx, which could
>> override the assertion values.
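>> 
>> A rough sketch of how the end of the fixed flow in
>> AggregatePlaintextFastUMLSProcessor.xml might then look. It assumes that
>> SentenceAdjuster and ClassifiableEntriesAnnotator are also declared as
>> delegateAnalysisEngine entries in that aggregate; the comments stand in
>> for the nodes that stay as they are:
>> 
>> <fixedFlow>
>> <node>SimpleSegmentAnnotator</node>
>> <node>SentenceDetectorAnnotator</node>
>> <node>TokenizerAnnotator</node>
>> <node>LvgAnnotator</node>
>> <!-- remaining clinical pipeline nodes, unchanged -->
>> <node>ExtractionPrepAnnotator</node>
>> <!-- smoking status nodes appended last; ExternalBaseAggregateTAE left out -->
>> <node>SentenceAdjuster</node>
>> <node>ClassifiableEntriesAnnotator</node>
>> </fixedFlow>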
>> 
>> Disclaimer: I have not tested this yet.  Feel free to open a Jira item if it
>> works for you so it can be tracked.  It seems kind of strange to have a
>> descriptor XML define another XML descriptor to be loaded via code
>> again; I think this could be simplified.
>> --Pei
>> 
>>> On Thu, Apr 16, 2015 at 7:29 PM, Tom Devel <develxy@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> I am using the smoking status AE from SimulatedProdSmokingTAE.xml. It works
>>> fine; I can see the smoking status annotation in the CVD.
>>> 
>>> Now I would like to include the smoking status detection in the clinical
>>> pipeline of AggregatePlaintextFastUMLSProcessor.xml, so that when I run the
>>> clinical pipeline, the smoking status will also be determined.
>>> 
>>> How can I do this?
>>> 
>>> I am thinking of just putting the nodes from the fixed flow of
>>> SimulatedProdSmokingTAE.xml into the fixed flow of
>>> AggregatePlaintextFastUMLSProcessor.xml. Is this the right approach?
>>> 
>>> If so, at which exact place in the clinical pipeline fixed flow should
>>> these nodes be added?
>>> 
>>> Is there a preferred place (such as appending after the last node or
>>> inserting before the first node)?
>>> 
>>> Can a wrong position or ordering of the smoking status nodes damage/corrupt
>>> the rest of the annotations?
>>> 
>>> SimulatedProdSmokingTAE.xml contains these lines with the fixed flow:
>>> 
>>> <fixedFlow>
>>> <node>ExternalBaseAggregateTAE</node>
>>> <node>SentenceAdjuster</node>
>>> <node>ClassifiableEntriesAnnotator</node>
>>> </fixedFlow>
>>> 
>>> AggregatePlaintextFastUMLSProcessor.xml (3.2.2 from SVN) contains this
>>> fixed flow:
>>> 
>>> <fixedFlow>
>>> <node>SimpleSegmentAnnotator</node>
>>> <node>SentenceDetectorAnnotator</node>
>>> <node>TokenizerAnnotator</node>
>>> <node>LvgAnnotator</node>
>>> <node>ContextDependentTokenizerAnnotator</node>
>>> <node>POSTagger</node>
>>> <!-- <node>ClearPOSTagger</node> -->
>>> <node>Chunker</node>
>>> <node>AdjustNounPhraseToIncludeFollowingNP</node>
>>> <node>AdjustNounPhraseToIncludeFollowingPPNP</node>
>>> <!--<node>LookupWindowAnnotator</node>-->
>>> <node>DictionaryLookupAnnotatorDB</node>
>>> <node>DrugNER</node>
>>> <node>DependencyParser</node>
>>> <node>SemanticRoleLabeler</node>
>>> <node>ConstituencyParser</node>
>>> <!-- <node>AssertionAnnotator</node> -->
>>> <!-- <node>StatusAnnotator</node> -->
>>> <!-- <node>NegationAnnotator</node> -->
>>> <node>GenericCleartkAnalysisEngine</node>
>>> <node>HistoryCleartkAnalysisEngine</node>
>>> <node>PolarityCleartkAnalysisEngine</node>
>>> <node>SubjectCleartkAnalysisEngine</node>
>>> <node>UncertaintyCleartkAnalysisEngine</node>
>>> 
>>> <node>ExtractionPrepAnnotator</node>
>>> </fixedFlow>
>>> 
>>> Thanks for any help or pointers,
>>> 
>>> Tom
>> 
