ctakes-user mailing list archives

From Niraj Shrestha <nshrest...@gmail.com>
Subject ctakes concept and relation extraction
Date Tue, 02 Aug 2016 12:13:32 GMT
Dear Sir,
I am trying to extract named entities and their relations from medical
documents. If I understood correctly, concepts are basically entities.
I have used two different analysis engines:
     AggregatePlaintextFastUMLSProcessor.xml for concept extraction and
     RelationExtractorAggregate for relation extraction.

My first question: how can I combine both engines to obtain concepts and
relations in a single file?

If I understood correctly, to extract all the entities (concepts) I need to
get all the "org.apache.ctakes.typesystem.type.refsem.UmlsConcept" nodes
from the output XML file. But how can I choose a single entity or concept
from a list of many concepts?

Also, what is the FSArray in which all the concept ids are listed?

I found that some concepts are not mentioned in the input data but appear in
the output. For example, when I use the following engine on the "note.txt"
file:

<import
location="../analysis_engine/AggregatePlaintextFastUMLSProcessor.xml"/>
The output file is "note.txt4.xml" (attached here).

One of the concepts is the following, where "Kidney" is given as
preferredText but the word "kidney" does not occur in the input data.

<org.apache.ctakes.typesystem.type.refsem.UmlsConcept _id="4503"
codingScheme="SNOMEDCT" code="64033007" oid="64033007#SNOMEDCT" score="0.0"
disambiguated="false" cui="C0022646" tui="T023" preferredText="Kidney"/>
    <org.apache.ctakes.typesystem.type.refsem.UmlsConcept _id="4493"
codingScheme="SNOMEDCT" code="17373004" oid="17373004#SNOMEDCT" score="0.0"
disambiguated="false" cui="C0227665" tui="T023" preferredText="Both
kidneys"/>
    <org.apache.ctakes.typesystem.type.refsem.UmlsConcept _id="4483"
codingScheme="SNOMEDCT" code="181414000" oid="181414000#SNOMEDCT"
score="0.0" disambiguated="false" cui="C1278978" tui="T023"
preferredText="Entire kidney"/>
    <uima.cas.FSArray _id="4513" size="3">
        <i>4483</i>
        <i>4493</i>
        <i>4503</i>
    </uima.cas.FSArray>
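
To make the question about the FSArray concrete: in the fragment above, each <i> entry holds the _id of one of the UmlsConcept elements, so the array appears to group the candidate concepts belonging to one mention. Below is a minimal sketch of following those references with Python's standard library. The <CAS> wrapper element and the trimmed attribute list are assumptions made only to keep the sample self-contained; the ids, CUIs and preferredText values are the ones shown above.

```python
import xml.etree.ElementTree as ET

# Sample built from the fragment above; the <CAS> root is an assumed wrapper.
SAMPLE = """<CAS>
  <org.apache.ctakes.typesystem.type.refsem.UmlsConcept _id="4503"
      codingScheme="SNOMEDCT" code="64033007" cui="C0022646" tui="T023"
      preferredText="Kidney"/>
  <org.apache.ctakes.typesystem.type.refsem.UmlsConcept _id="4493"
      codingScheme="SNOMEDCT" code="17373004" cui="C0227665" tui="T023"
      preferredText="Both kidneys"/>
  <org.apache.ctakes.typesystem.type.refsem.UmlsConcept _id="4483"
      codingScheme="SNOMEDCT" code="181414000" cui="C1278978" tui="T023"
      preferredText="Entire kidney"/>
  <uima.cas.FSArray _id="4513" size="3">
    <i>4483</i>
    <i>4493</i>
    <i>4503</i>
  </uima.cas.FSArray>
</CAS>"""


def concepts_by_array(root):
    """Map each FSArray _id to the (cui, preferredText) pairs it references."""
    # Index every element that carries an _id so references can be followed.
    by_id = {el.get("_id"): el for el in root.iter() if el.get("_id")}
    groups = {}
    for arr in root.iter("uima.cas.FSArray"):
        members = []
        for item in arr.findall("i"):
            ref = (item.text or "").strip()
            target = by_id.get(ref)
            # Only keep entries that point at a concept-like element.
            if target is not None and target.get("cui"):
                members.append((target.get("cui"), target.get("preferredText")))
        groups[arr.get("_id")] = members
    return groups


root = ET.fromstring(SAMPLE)
groups = concepts_by_array(root)
for array_id, members in groups.items():
    print(array_id, members)
```

For the real output, ET.parse("note.txt4.xml").getroot() would take the place of ET.fromstring(SAMPLE). This only lists the candidates per array; it does not decide which single concept to keep.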


************************************
My next query concerns relation extraction, for which I use the following
engine:

<import
location="../../../ctakes-relation-extractor/desc/analysis_engine/RelationExtractorAggregate.xml"/>
The output file is "note.txt_relation.xml" (attached here).

I am not able to interpret the output file (note.txt_relation.xml). It lists
relations and their locations, but I could not figure out which entities are
involved and what the relation between them is in terms of words.

For example:

<org.apache.ctakes.typesystem.type.relation.RelationArgument _indexed="1"
_id="12422" id="0" _ref_argument="10680" role="Argument"/>
    <org.apache.ctakes.typesystem.type.relation.RelationArgument
_indexed="1" _id="12427" id="0" _ref_argument="10989" role="Related_to"/>
    <org.apache.ctakes.typesystem.type.relation.RelationArgument
_indexed="1" _id="12446" id="0" _ref_argument="10680" role="Argument"/>
.
.
.
.
<org.apache.ctakes.typesystem.type.relation.RelationArgument _indexed="1"
_id="12851" id="0" _ref_argument="12181" role="Related_to"/>
    <org.apache.ctakes.typesystem.type.relation.LocationOfTextRelation
_indexed="1" _id="12432" id="0" category="location_of"
discoveryTechnique="0" confidence="0.0" polarity="0" uncertainty="0"
conditional="false" _ref_arg1="12422" _ref_arg2="12427"/>
    <org.apache.ctakes.typesystem.type.relation.LocationOfTextRelation
_indexed="1" _id="12456" id="0" category="location_of"
discoveryTechnique="0" confidence="0.0" polarity="0" uncertainty="0"
conditional="false" _ref_arg1="12446" _ref_arg2="12451"/>
    <org.apache.ctakes.typesystem.type.relation.LocationOfTextRelation
_indexed="1" _id="12480" id="0" category="location_of"
discoveryTechnique="0" confidence="0.0" polarity="0" uncertainty="0"
conditional="false" _ref_arg1="12470" _ref_arg2="12475"/>
    <org.apache.ctakes.typesystem.type.relation.LocationOfTextRelation
_indexed="1" _id="12508" id="0" category="location_of"
discoveryTechnique="0" confidence="0.0" polarity="0" uncertainty="0"
conditional="false" _ref_arg1="12498" _ref_arg2="12503"/>
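
Reading the fragment above, each relation points to two RelationArgument elements through _ref_arg1 and _ref_arg2, and each RelationArgument points to some other element through _ref_argument. The sketch below follows that chain to turn one relation into words. It rests on assumptions that are not visible in the fragment: that the referenced elements are mention annotations carrying begin and end character offsets, and that the document text is available separately. The two mention elements, their offsets, the <CAS> wrapper and the sample sentence are hypothetical; only the ids 10680, 10989, 12422, 12427 and 12432 come from the output.

```python
import xml.etree.ElementTree as ET

# Hypothetical document text and mention offsets, for illustration only.
TEXT = "pain in the knee"

SAMPLE = """<CAS>
  <org.apache.ctakes.typesystem.type.textsem.SignSymptomMention
      _id="10680" begin="0" end="4"/>
  <org.apache.ctakes.typesystem.type.textsem.AnatomicalSiteMention
      _id="10989" begin="12" end="16"/>
  <org.apache.ctakes.typesystem.type.relation.RelationArgument _indexed="1"
      _id="12422" id="0" _ref_argument="10680" role="Argument"/>
  <org.apache.ctakes.typesystem.type.relation.RelationArgument _indexed="1"
      _id="12427" id="0" _ref_argument="10989" role="Related_to"/>
  <org.apache.ctakes.typesystem.type.relation.LocationOfTextRelation
      _indexed="1" _id="12432" id="0" category="location_of"
      _ref_arg1="12422" _ref_arg2="12427"/>
</CAS>"""


def relation_triples(root, text):
    """Return (arg1 text, category, arg2 text) for every resolvable relation."""
    by_id = {el.get("_id"): el for el in root.iter() if el.get("_id")}

    def covered_text(argument_id):
        # RelationArgument -> referenced mention -> substring of the document.
        argument = by_id.get(argument_id)
        if argument is None:
            return None
        mention = by_id.get(argument.get("_ref_argument"))
        if mention is None or mention.get("begin") is None:
            return None
        return text[int(mention.get("begin")):int(mention.get("end"))]

    triples = []
    for el in root.iter():
        arg1, arg2 = el.get("_ref_arg1"), el.get("_ref_arg2")
        if not (arg1 and arg2):
            continue  # not a relation element
        first, second = covered_text(arg1), covered_text(arg2)
        if first is not None and second is not None:
            triples.append((first, el.get("category"), second))
    return triples


root = ET.fromstring(SAMPLE)
triples = relation_triples(root, TEXT)
for triple in triples:
    print(triple)
```

Relations whose arguments cannot be resolved are skipped, which matters for a partial fragment like the one above where ids such as 12451 are not shown.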


Sorry for the long message and so many queries at once.

Thanks a lot in advance for your suggestions.

With regards,
Shrestha
