ctakes-dev mailing list archives

From "Masanz, James J." <Masanz.Ja...@mayo.edu>
Subject RE: What is the correct way to enable cTakes to recognize and annotate mentions of medical devices?
Date Sun, 03 May 2015 11:59:53 GMT
I think the 'right way' would be to create a subtype of org.apache.ctakes.typesystem.type.textsem.EntityMention
(rather than a subtype of IdentifiedAnnotation, which is what you tried).
I don't know what other changes would be required beyond that.
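
As a rough, untested sketch, the type description might look like the following. It is adapted from the DeviceMention definition in the quoted message below, with only the supertype changed. It assumes that EntityMention already supplies every feature needed, so none is declared here, and that the JCas cover classes are regenerated after TypeSystem.xml is edited. The description text is illustrative only.

```xml
<typeDescription>
  <name>org.apache.ctakes.typesystem.type.textsem.DeviceMention</name>
  <!-- Hypothetical type: not part of the stock cTAKES type system. -->
  <description>Mention of a medical device (for example, a concept with TUI T074).</description>
  <!-- Assumption: features are inherited from EntityMention, so no features element is declared. -->
  <supertypeName>org.apache.ctakes.typesystem.type.textsem.EntityMention</supertypeName>
</typeDescription>
```

Any code that currently creates the annotation would also need to instantiate DeviceMention instead of EntityMention; where that happens depends on the lookup component in use.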

If you get that to work, please create a JIRA with the information so the documentation can
be updated.

If I just wanted to get it to work, I would look at one of these options:
 - use the method you tried with EntityMention, and if you want to get just the devices, loop
through all EntityMention annotations and check explicitly that the type is exactly EntityMention
(and not a subtype of EntityMention).
 - use the method you tried with either ProcedureMention or EntityMention, and then in cases
where you need to know whether it is a device, use the _ref_ontologyConceptArr to find out the
TUI associated with the annotation.


Regards,
James

-----Original Message-----
From: Bruce Tietjen [mailto:bruce.tietjen@perfectsearchcorp.com] 
Sent: Thursday, April 30, 2015 3:39 PM
To: dev@ctakes.apache.org
Subject: What is the correct way to enable cTakes to recognize and annotate mentions of medical
devices?

Hi,
I am using the cTakes 3.2.0 code base and I have been trying to figure out
what would be the proper way to get cTakes to recognize and annotate
mentions of medical devices.

I am using the AggregatePlaintextUMLSProcessor.xml because one of the main
requirements for the annotation is that it needs to include subject,
polarity, certainty, etc.

First, I added TUI T074 to the procedureTuis in the LookupDesc_Db.xml. The
output from this provides mostly what I want, except it lumps the devices
as ProcedureMention where I would like them to be distinguished by their
own annotation.

An example annotation created this way was:
    <org.apache.ctakes.typesystem.type.textsem.ProcedureMention
_indexed="1" _id="35574" _ref_sofa="3" begin="472" end="481" id="32"
_ref_ontologyConceptArr="35569" typeID="8" segmentID="SIMPLE_SEGMENT"
discoveryTechnique="1" confidence="1.0" polarity="1" uncertainty="0"
conditional="false" generic="false" subject="patient" historyOf="0"/>

I also tried adding code to classify devices as an EntityMention, and that
seemed to work too:
    <org.apache.ctakes.typesystem.type.textsem.EntityMention _indexed="1"
_id="35574" _ref_sofa="3" begin="472" end="481" id="32"
_ref_ontologyConceptArr="35569" typeID="8" segmentID="SIMPLE_SEGMENT"
discoveryTechnique="1" confidence="1.0" polarity="1" uncertainty="0"
conditional="false" generic="false" subject="patient" historyOf="0"/>

Again, that doesn't give devices their own unique annotation so I looked
further. Exploring the typesystem, I noticed the following types in
TypeSystem.xml:
    org.apache.ctakes.typesystem.type.refsem.ProcedureDevice
    org.apache.ctakes.typesystem.type.textsem.ProcedureDeviceModifier

These seemed like the closest defined types to what I would expect so I
thought I would see if using them would generate what I wanted. I modified
the code to generate these annotations and the result was as follows:
    <org.apache.ctakes.typesystem.type.textsem.ProcedureDeviceModifier
_indexed="1" _id="35587" _ref_sofa="3" begin="472" end="481" id="32"
typeID="0" segmentID="SIMPLE_SEGMENT" discoveryTechnique="0"
confidence="0.0" polarity="0" uncertainty="0" conditional="false"
generic="false" subject="patient" historyOf="0"
_ref_normalizedForm="35574"/>
    <org.apache.ctakes.typesystem.type.refsem.ProcedureDevice _id="35574"
id="0" _ref_ontologyConcept="35542" discoveryTechnique="1" confidence="0.0"
conditional="false" generic="false" polarity="0" uncertainty="0"
historyOf="0"/>

The problem with this approach was that confidence, polarity, and
uncertainty did not get filled in. I tried adding these to the inputs and
outputs of the AssertionMiniPipelineAnalysisEngine, but that didn't seem to
have any effect. Perhaps I didn't do it right? Or maybe it isn't even the
right pipeline component to try to modify?

Since ProcedureDevice and ProcedureDeviceModifier have different supertypes
than ProcedureMention, I also tried creating a new type in
TypeSystem.xml:
    <typeDescription>
      <name>org.apache.ctakes.typesystem.type.textsem.DeviceMention</name>
      <supertypeName>org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation</supertypeName>
      <features>
        <featureDescription>
          <name>entity</name>
          <description/>
          <rangeTypeName>org.apache.ctakes.typesystem.type.refsem.Entity</rangeTypeName>
        </featureDescription>
      </features>
    </typeDescription>

Modifying the code to create this type, the result was:
    <org.apache.ctakes.typesystem.type.textsem.DeviceMention _indexed="1"
_id="35574" _ref_sofa="3" begin="472" end="481" id="32"
_ref_ontologyConceptArr="35569" typeID="8" segmentID="SIMPLE_SEGMENT"
discoveryTechnique="1" confidence="0.0" polarity="0" uncertainty="0"
conditional="false" generic="false" subject="patient" historyOf="0"/>

Again the problem here is that confidence, polarity and uncertainty are not
filled in.

So, I am left wondering:
1) Which of the methods I tried would be the best "cTakes way"?
2) What do I need to modify to get confidence, polarity and uncertainty to
be filled in?

Thanks,
Bruce