incubator-ctakes-dev mailing list archives

From "Wu, Stephen T., Ph.D." <Wu.Step...@mayo.edu>
Subject Re: type system changes needed to read SHARP data
Date Mon, 26 Nov 2012 20:28:33 GMT
Thanks for all your work, Steve.

> * I couldn't find an entity type for "Clinical_attribute", "Devices", "Lab",
> "Phenomena"
"Devices" and "Phenomena" don't exist yet because they were not part of the
CEM models.  I need input from someone on CEMs if we're to add these.

"Clinical_attribute" -- is this what you're looking for:
 org.apache.ctakes.typesystem.type.refsem.Attribute
It inherits from Element.  If you see something other than known subtypes
(e.g., BodyLaterality, BodySide, Course, Severity, Procedure*, Lab*,
Medication*), we should probably extend.

Lab should be at org.apache.ctakes.typesystem.type.refsem.Lab


> * I couldn't find a modifier type (or alternatively, an Annotation subclass)
> for the Knowtator annotations "generic_class", "conditional_class",
> "uncertainty_indicator_class", "distal_or_proximal", "Person",
> "negation_indicator_class", "historyOf_indicator_class",
> "superior_or_inferior", "medial_or_lateral", "dorsal_or_ventral",
> "method_class", "device_class", "allergy_indicator_class", "Route", "Form",
> "Strength", "Strength number", "Strength unit", "Frequency", "Frequency
> number", "Frequency unit", "Value", "Value number", "Value unit",
> "estimated_flag_indicator", "reference_range", "Date", "Status change",
> "Duration", "Dosage".
Use the type org.apache.ctakes.typesystem.type.textsem.Modifier with the
"category" feature.

> * I couldn't find a place for the normalized value of
"generic_class", --> IdentifiedAnnotation:generic
"conditional_class",  --> IdentifiedAnnotation:conditional
"uncertainty_indicator_class", --> IdentifiedAnnotation:uncertainty
"negation_indicator_class",  --> IdentifiedAnnotation:polarity
"distal_or_proximal", --> BodyLaterality:value
"superior_or_inferior", --> BodyLaterality:value
"dorsal_or_ventral", --> BodyLaterality:value
"medial_or_lateral", --> BodyLaterality:value
"device_class", --> ProcedureDevice:value
"Person", --> Entity
"allergy_indicator_class", --> ?
"lab_interpretation_indicator", --> ?
"estimated_flag_indicator"--> ?

Value should be set according to the constants in
src/main/java/org/apache/ctakes/typesystem/type/constants/CONST.java

These are my best estimates.  We may need to add homes for the three question
marks, plus things like "Device"... But let's hold off on that for now.
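
As a minimal sketch, the mapping above could be kept in a small lookup table
while reading the Knowtator files.  The class KnowtatorTargets and the method
targetFor are hypothetical names used only for illustration; they are not part
of cTAKES.  The three classes marked "?" are deliberately left out, so a
lookup for them returns null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of cTAKES): records where the normalized
// value of each Knowtator class is expected to land, written "Type:feature".
final class KnowtatorTargets {
    private static final Map<String, String> TARGETS = new LinkedHashMap<>();

    static {
        TARGETS.put("generic_class", "IdentifiedAnnotation:generic");
        TARGETS.put("conditional_class", "IdentifiedAnnotation:conditional");
        TARGETS.put("uncertainty_indicator_class", "IdentifiedAnnotation:uncertainty");
        TARGETS.put("negation_indicator_class", "IdentifiedAnnotation:polarity");
        TARGETS.put("distal_or_proximal", "BodyLaterality:value");
        TARGETS.put("superior_or_inferior", "BodyLaterality:value");
        TARGETS.put("dorsal_or_ventral", "BodyLaterality:value");
        TARGETS.put("medial_or_lateral", "BodyLaterality:value");
        TARGETS.put("device_class", "ProcedureDevice:value");
        TARGETS.put("Person", "Entity");
        // allergy_indicator_class, lab_interpretation_indicator and
        // estimated_flag_indicator have no agreed home yet, so they are omitted.
    }

    // Returns the target for a Knowtator class, or null if none is agreed on.
    static String targetFor(String knowtatorClass) {
        return TARGETS.get(knowtatorClass);
    }
}
```

A reader could log every class for which targetFor returns null, which gives a
running list of what still needs a decision.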

> * I couldn't find a place for the "associatedCode" of a "Person" or
> "historyOf_indicator_class"
> * There were several things in the Knowtator annotations that I couldn't even
> guess what they meant: "Attributes_lab", "Temporal", ":THING", "Entities".
Attributes_lab should probably include the attributes value number,
reference range, delta flag, and ordinal.

Someone else who knows the annotation schema (e.g., Guergana) needs to weigh
in on this.  I'm not sure what most of the rest of these are intended to be.

> After working with this data I think we should consider having separate UIMA
> Annotation sub-types for each of the things that are Modifiers now. For
> example, if we have a real Severity Annotation for textual mentions of
> severity, then the CAS makes it easy to select these. We have exactly this use
> case in relation extractor - we need just the Severity modifiers, excluding
> all the other modifiers. Basically, I think the principle we should follow in
> UIMA is:
> 
> "If you could imagine searching the CAS for something, then that something
> should have it's own Annotation sub-type."
> 
It's a good point, and a relatively good principle, but we have decided
against it in the past.  The reason is a countering principle:

 "Do not put locally used (component-specific) types in the CAS."
There is no garbage collection in UIMA (even when things are deleted from
the index), and extra types will bloat the CAS, though admittedly the bloat
is not too terrible.

Currently, the idea would be to create local objects or types.

Another reason for this is that we didn't want to be making changes to the
type system quite so frequently, and anybody can look for something locally
that nobody else cares about -- we shouldn't make full type system changes
for those.
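
To illustrate the "local objects" idea for the Severity use case, here is a
minimal sketch that filters generic modifiers by category inside the
component, without adding a type to the type system.  ModifierStandIn and
LocalModifierSelector are hypothetical stand-ins so the example runs without
UIMA, and the string category values are assumptions; the real Modifier type
and its category encoding may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Modifier annotation carrying a category.
final class ModifierStandIn {
    final String coveredText;
    final String category; // assumed to be a plain string here

    ModifierStandIn(String coveredText, String category) {
        this.coveredText = coveredText;
        this.category = category;
    }
}

// Component-local selection: nothing here is written back to the CAS.
final class LocalModifierSelector {
    // Keeps only the modifiers whose category equals the requested one.
    static List<ModifierStandIn> withCategory(List<ModifierStandIn> all,
                                              String category) {
        List<ModifierStandIn> selected = new ArrayList<>();
        for (ModifierStandIn m : all) {
            if (category.equals(m.category)) {
                selected.add(m);
            }
        }
        return selected;
    }
}
```

In a real component the input list would come from the CAS (for example, by
selecting all Modifier annotations with uimaFIT's JCasUtil.select), and the
filter would read the category feature instead of a plain field.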

Two doubts that could change my mind:
 1) Do we envision evaluation of the Modifiers/attributes -- apart from the
Named Entities they're attached to?  If so, we need to preserve this
information right at the beginning.
 2) Will these modifiers be reusable downstream?

stephen


