ctakes-dev mailing list archives

From Anirban Chakraborti <chakraborti.anir...@googlemail.com>
Subject Re: markable types
Date Tue, 20 May 2014 11:24:18 GMT
Here it is

1. The cTAKES type system represents syntax trees with three types:
TopTreebankNode, TreebankNode, and TerminalTreebankNode. Understood.

2. The parser works at the sentence level, so a standard approach is to
get all trees/sentences at once by doing:
for (TopTreebankNode tree : JCasUtil.select(jcas, TopTreebankNode.class))
Understood.

My question is this: a single word in a sentence may belong to several
types simultaneously. How is the associated type class stored in the
nodes of the tree, so that when we parse the tree/sentence we can select
the type of interest and its associated features/attributes?

What I want to understand is what the key/value pairs of each node are.

Basically, so that the following code works:

List<DiseaseDisorderMention> disease = new
ArrayList<>(JCasUtil.select(jcas, DiseaseDisorderMention.class));
// DiseaseDisorderMention is the selected type class to be extracted
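
The question above can be illustrated with a toy model. In UIMA, annotations are generally separate objects held in the CAS indexes, each with begin and end offsets, rather than attributes stored on parse tree nodes. The sketch below uses plain Java only; Span, TreeNode, DiseaseMention, SymptomMention and Index are hypothetical stand-ins, not cTAKES or UIMA classes. It shows how one word can be covered by several typed annotations and still be looked up by type and by covering span. uimaFIT's JCasUtil.selectCovered plays a similar role for real annotations, though it is worth checking against the version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical stand-ins, not UIMA or cTAKES types.
class AnnotationIndexSketch {

    // An annotation is just a typed span of character offsets in the document.
    static class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Stand-in for a parse tree node; note that it holds no list of "types".
    static class TreeNode extends Span {
        final String label;
        TreeNode(int begin, int end, String label) { super(begin, end); this.label = label; }
    }

    // Stand-ins for two mention types that may cover the same word.
    static class DiseaseMention extends Span {
        DiseaseMention(int begin, int end) { super(begin, end); }
    }
    static class SymptomMention extends Span {
        SymptomMention(int begin, int end) { super(begin, end); }
    }

    // Stand-in for the CAS: one flat store, queried by Java class.
    static class Index {
        private final List<Span> all = new ArrayList<>();

        void add(Span s) { all.add(s); }

        // All annotations of the requested type, in insertion order.
        <T extends Span> List<T> select(Class<T> type) {
            List<T> out = new ArrayList<>();
            for (Span s : all) {
                if (type.isInstance(s)) out.add(type.cast(s));
            }
            return out;
        }

        // Annotations of the requested type lying inside the covering span.
        <T extends Span> List<T> selectCovered(Class<T> type, Span cover) {
            List<T> out = new ArrayList<>();
            for (T s : select(type)) {
                if (s != cover && s.begin >= cover.begin && s.end <= cover.end) out.add(s);
            }
            return out;
        }
    }

    // Builds a tiny index for the text "severe headache" and counts matches.
    static int[] demo() {
        Index index = new Index();
        TreeNode phrase = new TreeNode(0, 15, "NP");   // whole phrase
        index.add(phrase);
        index.add(new TreeNode(7, 15, "NN"));          // "headache"
        index.add(new DiseaseMention(7, 15));          // same word, one type
        index.add(new SymptomMention(7, 15));          // same word, another type
        return new int[] {
            index.select(DiseaseMention.class).size(),
            index.selectCovered(SymptomMention.class, phrase).size(),
            index.selectCovered(TreeNode.class, phrase).size()
        };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```

In this model the tree node and the two mentions are linked only by their offsets, so asking "which types does this node belong to" becomes a covered-span query, not a lookup of key/value pairs on the node.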



Hope I am clearer this time

 Anir




On Tue, May 20, 2014 at 4:32 PM, Miller, Timothy <
Timothy.Miller@childrens.harvard.edu> wrote:

> I don't understand this question. Can you try to rephrase it? Or maybe if
> you tell me what you want to do that would help me understand.
>
> ________________________________________
> From: Anirban Chakraborti [chakraborti.anirban@googlemail.com]
> Sent: Tuesday, May 20, 2014 6:34 AM
> To: dev@ctakes.apache.org
> Subject: Re: markable types
>
> thanks again Timothy
>
> final question for now
>
> You had explained that each sentence is parsed and converted to a
> tree with head and terminal nodes. Is the type system of cTAKES a feature
> of the node, i.e. can one node belong to two or more type systems and their
> further attributes, OR is there, for each type system, a syntax tree for
> every sentence parsed? I mean, a sentence would have various trees attached
> to it, but there would be a 1:1 mapping between the node and the type system.
>
> Anir
>
>
> On Tue, May 20, 2014 at 2:17 AM, Miller, Timothy <
> Timothy.Miller@childrens.harvard.edu> wrote:
>
> >
> > On 05/18/2014 07:40 AM, Anirban Chakraborti wrote:
> > > Timothy,
> > >
> > > 1. So to get concepts of procedure, lab (if any), disease/disorder,
> > > sign/symptom, and anatomical site, I would need to do
> > >
> > > List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas,
> > > MedicationMention.class));
> > > List<DiseaseDisorderMention> disease = new
> > > ArrayList<>(JCasUtil.select(jcas, DiseaseDisorderMention.class));
> > > List<SignSymptomMention> signs = new ArrayList<>(JCasUtil.select(jcas,
> > > SignSymptomMention.class));
> > > List<AnatomicalSiteMention> anatomy = new
> > > ArrayList<>(JCasUtil.select(jcas, AnatomicalSiteMention.class));
> > > List<LabMention> labs = new ArrayList<>(JCasUtil.select(jcas,
> > > LabMention.class));
> > >
> > > then check the size of each list {meds, disease, signs, anatomy, labs},
> > > print out the list, or build a new list using the java.util.List or
> > > java.util.ArrayList types as the case might be. Right ...
> > yep
> > > 2. I am more interested in the IdentifiedAnnotation class. However, there
> > > are concepts like FractionAnnotation which are not defined as enums in
> > > const.java. How do I handle them? Do I need to add them to the const.java
> > > file?
> > nope, you probably just want EntityMention (for anatomical sites) and
> > EventMention (for all clinical events, including DiseaseDisorder,
> > Procedure, SignSymptom, etc.).
> >
> > >
> > > 3. What exactly is the functional difference between, say,
> > > MedicationEventMention.java, MedicationMention.java, Medication.java and
> > > MedicationEventMention_type.java? I understand there is a similar
> > > difference between the classes for lab, procedure, etc.
> > The types ending in _type.java are UIMA-internal types, you can ignore.
> > Medication is a referential type -- something in the real world that
> > could be referred to multiple times in a document. What you probably
> > want are the mention types. Here I believe MedicationMention is the
> > preferred type going forward for a particular mention of a medication in
> > text (MedicationEventMention is the same thing but not preferred going
> > forward).
> >
> >
> > >
> > > 4. You had explained that each sentence is parsed and converted to a
> > > tree with head and terminal nodes. Is the type system of cTAKES a feature
> > > of the node, i.e. can one node belong to two or more type systems and their
> > > further attributes, OR is there, for each type system, a syntax tree for
> > > every sentence parsed? I mean, a sentence would have various trees attached
> > > to it, but there would be a 1:1 mapping between the node and the type
> > > system.
> > >
> > > Many Thanks
> > >
> > > Anirban
> > >
> > >
> > >
> > >
> > >
> > > On Thu, May 15, 2014 at 5:03 PM, Miller, Timothy <
> > > Timothy.Miller@childrens.harvard.edu> wrote:
> > >
> > >> Anir -- I'm not sure I understand your question, but from your example it
> > >> doesn't sound like a tree exactly. If you just want a list of medication
> > >> concepts you can do something like:
> > >>
> > >> List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas,
> > >> MedicationMention.class));
> > >> (I believe MedicationMention is the correct class but check your output.)
> > >>
> > >> If you really do want to put them into a syntax tree, there are also
> > >> methods for doing that in the AnnotationTreeUtils class.
> > >>
> > >> getAnnotationTree(JCas, Annotation) will give you the tree for the whole
> > >> sentence containing the annotation you give it.
> > >> annotationNode(JCas, Annotation) will give you the smallest subtree
> > >> covering the annotation you give it.
> > >> insertAnnotationNode(JCas, TopTreebankNode, Annotation, String) will
> > >> insert a node into the specified tree, at the level determined by the
> > >> annotation, with the category given by the string. So for example, if you
> > >> had meds as above you could then do:
> > >>
> > >> for(MedicationMention med : meds){
> > >>   AnnotationTreeUtils.insertAnnotationNode(jcas,
> > >>     AnnotationTreeUtils.getAnnotationTree(jcas, med), med, "MEDICATION");
> > >> }
> > >>
> > >> which would insert a new node with the label "MEDICATION" into every
> > >> tree, at every position where a medication was found.
> > >>
> > >> One caveat to the above code is that these methods will actually change
> > >> the tree in the CAS. That might be OK for some use cases, but for many you
> > >> want to work on a tree outside the CAS, which is why there are also the
> > >> methods:
> > >> getTreeCopy(JCas, TopTreebankNode)
> > >> getTreeCopy(JCas, TreebankNode)
> > >>
> > >> If you use the getAnnotationTree method to obtain the tree you want, you
> > >> can then get a copy from these methods, use the insert methods on the copy,
> > >> and do something with the result immediately (like print it out), without
> > >> altering the originals in the CAS in case other AEs use them.
> > >>
> > >> Tim
> > >>
> > >>
> > >>
> > >> ________________________________________
> > >> From: Anirban Chakraborti [chakraborti.anirban@googlemail.com]
> > >> Sent: Sunday, May 11, 2014 9:15 AM
> > >> To: dev@ctakes.apache.org
> > >> Subject: Re: markable types
> > >>
> > >> Steven,
> > >>
> > >> Would you have any example code of a tree parser, so the output can be
> > >> arranged as needed? I mean, after successful annotation, I want to
> > >> extract certain concepts, like medication only, and arrange them in a new
> > >> tree so that all annotations referring to the medication concept and their
> > >> sources are listed together.
> > >>
> > >> Anir
> > >>
> > >>
> > >> On Sun, May 11, 2014 at 3:55 PM, Steven Bethard
> > >> <steven.bethard@gmail.com> wrote:
> > >>> I don't think "not something anyone would want extracted" should be an
> > >>> argument against anything. We already have constituent and dependency
> > >>> parse trees in the type system, and those would fall under that
> > >>> category.
> > >>>
> > >>> So +1 on markables in the type system. (In general, +1 on moving
> > >>> module-specific types to the standard type system. I'm not sure what
> > >>> the real benefit of splitting them out is...)
> > >>>
> > >>> Steve
> > >>>
> > >>> On Fri, May 9, 2014 at 11:53 AM, Miller, Timothy
> > >>> <Timothy.Miller@childrens.harvard.edu> wrote:
> > >>>> What do people think about taking the "markable" types out of the
> > >>>> coreference project and adding them to the standard type system? This is
> > >>>> a pretty standard concept in coreference that doesn't really have a
> > >>>> great natural representation in the current type system -- it
> > >>>> encompasses IdentifiedAnnotations as well as pronouns ("It", "him",
> > >>>> "her") and some determiners ("this").
> > >>>>
> > >>>> The drawback I can see is that it is probably not something anyone would
> > >>>> want extracted -- ultimately you want the actual coref pairs or chains.
> > >>>> But it is useful for things like representing gold standard input or
> > >>>> splitting coreference resolution into separate markable recognition and
> > >>>> relation classification steps.
> > >>>>
> > >>>> Tim
> > >>>>
> >
> > --
> > Tim Miller
> > Instructor
> > Boston Children's Hospital and Harvard Medical School
> > timothy.miller@childrens.harvard.edu
> > 617-919-1223
> >
> >
>
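
The copy-then-insert workflow described earlier in the thread (getTreeCopy followed by insertAnnotationNode) can be sketched with a toy tree. This is a minimal illustration in plain Java under stated assumptions: Node, copy, wrapChild and show are hypothetical helpers, not the AnnotationTreeUtils API, and wrapping a child in a new labelled node is a simplification of what the real insert method does. The point it shows is only that edits made to a deep copy leave the original tree untouched.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical helpers, not the cTAKES AnnotationTreeUtils API.
class TreeCopySketch {

    // A minimal labelled tree node.
    static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
        Node add(Node child) { children.add(child); return this; }
    }

    // Deep copy: every node is duplicated, so the copy shares nothing with the original.
    static Node copy(Node n) {
        Node c = new Node(n.label);
        for (Node child : n.children) c.add(copy(child));
        return c;
    }

    // Replaces child i of parent with a new node of the given label that wraps it.
    static void wrapChild(Node parent, int i, String label) {
        Node wrapper = new Node(label);
        wrapper.add(parent.children.get(i));
        parent.children.set(i, wrapper);
    }

    // Renders the tree in bracketed form, e.g. (S (NP aspirin) (VP helps)).
    static String show(Node n) {
        if (n.children.isEmpty()) return n.label;
        StringBuilder sb = new StringBuilder("(" + n.label);
        for (Node c : n.children) sb.append(' ').append(show(c));
        return sb.append(')').toString();
    }

    // Copies a small tree, inserts a MEDICATION node into the copy, returns both renderings.
    static String[] demo() {
        Node original = new Node("S")
            .add(new Node("NP").add(new Node("aspirin")))
            .add(new Node("VP").add(new Node("helps")));
        Node working = copy(original);
        wrapChild(working, 0, "MEDICATION");
        return new String[] { show(original), show(working) };
    }

    public static void main(String[] args) {
        for (String s : demo()) System.out.println(s);
    }
}
```

Working on the copy keeps the shared tree stable for any other component that reads it later, which is the concern raised in the thread about other AEs.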
