ctakes-dev mailing list archives

From "Miller, Timothy" <Timothy.Mil...@childrens.harvard.edu>
Subject Re: markable types
Date Mon, 19 May 2014 20:47:08 GMT

On 05/18/2014 07:40 AM, Anirban Chakraborti wrote:
> Timothy,
> 1. So to get concepts of procedure, lab (if any), disease/disorder, sign/
> symptom, and anatomical site, I would need to do
> List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas,
> MedicationMention.class));
> List<DiseaseDisorderMention> disease = new
> ArrayList<>(JCasUtil.select(jcas, DiseaseDisorderMention.class));
> List<SignSymptomMention> signs = new ArrayList<>(JCasUtil.select(jcas,
> SignSymptomMention.class));
> List<AnatomicalMention> anatomy = new ArrayList<>(JCasUtil.select(jcas,
> AnatomicalMention.class));
> List<LabMention> labs = new ArrayList<>(JCasUtil.select(jcas,
> LabMention.class));
> then check the size of each list {meds, disease, signs, anatomy, labs},
> and print it out or build a new list using java.util.List or
> java.util.ArrayList as the case might be. Right?
> 2. I am more interested in the IdentifiedAnnotation class. However, there
> are concepts like FractionAnnotation which are not defined as constants in
> const.java. How do I handle them? Do I need to add them to the const.java file?
Nope, you probably just want EntityMention (for anatomical sites) and
EventMention (for all clinical events, including DiseaseDisorder,
Procedure, SignSymptom, etc.).
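
The reason two selects can stand in for five is that selecting by a parent type also returns its subtypes. Below is a minimal, self-contained sketch of that idea. The classes are stand-ins that only mirror the inheritance described above; they are not the real cTAKES type system classes, and the `select` helper is a hypothetical imitation of the filtering that JCasUtil.select is assumed to perform. Check the class names against your installed type system before relying on them.

```java
import java.util.ArrayList;
import java.util.List;

public class MentionSelectSketch {
    // Stand-in hierarchy (assumption): mirrors the parent/child relations
    // described in the reply, not the actual cTAKES classes.
    static class IdentifiedAnnotation {
        final String text;
        IdentifiedAnnotation(String text) { this.text = text; }
    }
    static class EntityMention extends IdentifiedAnnotation {
        EntityMention(String text) { super(text); }
    }
    static class EventMention extends IdentifiedAnnotation {
        EventMention(String text) { super(text); }
    }
    static class AnatomicalSiteMention extends EntityMention {
        AnatomicalSiteMention(String text) { super(text); }
    }
    static class DiseaseDisorderMention extends EventMention {
        DiseaseDisorderMention(String text) { super(text); }
    }
    static class SignSymptomMention extends EventMention {
        SignSymptomMention(String text) { super(text); }
    }
    static class ProcedureMention extends EventMention {
        ProcedureMention(String text) { super(text); }
    }

    // Hypothetical helper: keeps every annotation that is an instance of the
    // requested type, which includes instances of its subtypes.
    static <T extends IdentifiedAnnotation> List<T> select(
            List<? extends IdentifiedAnnotation> index, Class<T> type) {
        List<T> matches = new ArrayList<>();
        for (IdentifiedAnnotation annotation : index) {
            if (type.isInstance(annotation)) {
                matches.add(type.cast(annotation));
            }
        }
        return matches;
    }

    // Made-up sample annotations, for illustration only.
    static List<IdentifiedAnnotation> sampleIndex() {
        List<IdentifiedAnnotation> index = new ArrayList<>();
        index.add(new DiseaseDisorderMention("pneumonia"));
        index.add(new SignSymptomMention("cough"));
        index.add(new ProcedureMention("chest x-ray"));
        index.add(new AnatomicalSiteMention("left lung"));
        return index;
    }

    public static void main(String[] args) {
        List<IdentifiedAnnotation> index = sampleIndex();
        // One select on the parent type covers all clinical event subtypes.
        for (EventMention event : select(index, EventMention.class)) {
            System.out.println(event.getClass().getSimpleName() + ": " + event.text);
        }
        // A second select covers the anatomical sites.
        for (EntityMention entity : select(index, EntityMention.class)) {
            System.out.println(entity.getClass().getSimpleName() + ": " + entity.text);
        }
    }
}
```

With the sample data, the EventMention select yields the three clinical events and the EntityMention select yields the single anatomical site. Selecting a subtype such as SignSymptomMention narrows the result again.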

> 3. what exactly is the functional difference between say
> MedicationEventMention .java, MedicationMention.java, Medication.java and
> MedicationEventMention_type.java .  I understand similar difference is
> between class of lab, procedure etc...
The types ending in _type.java are UIMA-internal types that you can ignore.
Medication is a referential type -- something in the real world that
could be referred to multiple times in a document. What you probably
want are the mention types. Here I believe MedicationMention is the
preferred type going forward for a particular mention of a medication in
text (MedicationEventMention is the same thing but not preferred going
forward).

> 4. You had explained that each sentence is parsed and converted to a
> tree with head and terminal nodes. Is the type system of cTAKES a feature
> of the node, i.e. can one node belong to two or more type systems and their
> further attributes, OR is there, for each type system, a syntax tree for
> every sentence parsed? I mean, a sentence has various trees attached to it
> but there is a 1:1 mapping between the node and the type system.
> Many Thanks
> Anirban
> On Thu, May 15, 2014 at 5:03 PM, Miller, Timothy <
> Timothy.Miller@childrens.harvard.edu> wrote:
>> Anir -- I'm not sure I understand your question but from your example it
>> doesn't sound like a tree exactly. If you just want a list of medication
>> concepts you can do something like:
>> List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas,
>> MedicationMention.class));
>> (I believe MedicationMention is the correct class but check your output.)
>> If you really do want to put them into a syntax tree, there are also
>> methods for doing that in AnnotationTreeUtils class.
>> getAnnotationTree(JCas, Annotation) will give you the tree for the whole
>> sentence containing the annotation you give it
>> annotationNode(JCas, Annotation) will give you the smallest subtree
>> covering the annotation you give it.
>> insertAnnotationNode(JCas, TopTreebankNode, Annotation, String) will
>> insert a node into the tree specified at the level specified by the
>> annotation with the category specified by the string. So for example if you
>> had meds as above you could then do:
>> for(MedicationMention med : meds){
>>   AnnotationTreeUtils.insertAnnotationNode(jcas,
>> AnnotationTreeUtils.getAnnotationTree(jcas, med), med, "MEDICATION");
>> }
>> which would insert a new node into every tree with the label "MEDICATION"
>> in every position where a medication was found.
>> One caveat to the above code is that these methods actually will change
>> the tree in the cas. That might be ok for some use cases but for many you
>> want to work on a tree outside the cas, so that's why there are also methods:
>> getTreeCopy(JCas, TopTreebankNode)
>> getTreeCopy(JCas, TreebankNode)
>> If you use the getAnnotationTree method to obtain the tree you want, then
>> you can get a copy from these methods, then use the insert methods and do
>> something with them immediately (like print them out), without altering the
>> originals in the cas if other AEs may use them.
>> Tim
>> ________________________________________
>> From: Anirban Chakraborti [chakraborti.anirban@googlemail.com]
>> Sent: Sunday, May 11, 2014 9:15 AM
>> To: dev@ctakes.apache.org
>> Subject: Re: markable types
>> Steven,
>> Would you have any example code of tree parser so the output can be
>> arranged as per need. I mean, after successful annotation, I want to
>> extract certain concepts like medication only and arrange them in a new
>> tree so that all annotation in reference to medication concept and their
>> sources are listed together.
>> Anir
>> On Sun, May 11, 2014 at 3:55 PM, Steven Bethard
>> <steven.bethard@gmail.com> wrote:
>>> I don't think "not something anyone would want extracted" should be an
>>> argument against anything. We already have constituent and dependency
>>> parse trees in the type system, and those would fall under that
>>> category.
>>> So +1 on markables in the type system. (In general, +1 on moving
>>> module-specific types to the standard type system. I'm not sure what
>>> the real benefit of splitting them out is...)
>>> Steve
>>> On Fri, May 9, 2014 at 11:53 AM, Miller, Timothy
>>> <Timothy.Miller@childrens.harvard.edu> wrote:
>>>> What do people think about taking the "markable" types out of the
>>>> coreference project and adding them to the standard type system? This is
>>>> a pretty standard concept in coreference that doesn't really have a
>>>> great natural representation in the current type system -- it
>>>> encompasses IdentifiedAnnotations as well as pronouns ("It", "him",
>>>> "her") and some determiners ("this").
>>>> The drawback I can see is that it is probably not something anyone would
>>>> want extracted -- ultimately you want the actual coref pairs or chains.
>>>> But it is useful for things like representing gold standard input or
>>>> splitting coreference resolution into separate markable recognition and
>>>> relation classification steps.
>>>> Tim

Tim Miller
Boston Children's Hospital and Harvard Medical School
