Mailing-List: contact dev-help@ctakes.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@ctakes.apache.org
Received-SPF: error (nike.apache.org: local policy)
From: "Miller, Timothy" <Timothy.Miller@childrens.harvard.edu>
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Subject: RE: markable types
Thread-Topic: markable types
Thread-Index: Ac9rnuNpQCrSpMJgQqmoegO29SQyJAImZXwA///Ex64=
Date: Tue, 20 May 2014 11:02:14 +0000
Message-ID: <E084D8EFE2B03A408B324458C5212E94244ACCC9@CHEXMBX3A.CHBOSTON.ORG>
References: <E084D8EFE2B03A408B324458C5212E942449E7EC@CHEXMBX3A.CHBOSTON.ORG>
	<CANuf76N_fDPa78OxnR9MAOgX8jp8AvKPR7hUOWJzyB0fSacO5Q@mail.gmail.com>
	<CAD_Tc=nGFHpeUv_dJwfjr+K0vQMGMqiCL_hq5Wcu3YbkPtRv+Q@mail.gmail.com>
	<E084D8EFE2B03A408B324458C5212E94244A45BA@CHEXMBX3A.CHBOSTON.ORG>
	<CAD_Tc==RC6RA3cPZi3h3GTaoZMZXKciRjW8wZ76AxpGzQ7hNsQ@mail.gmail.com>
	<E084D8EFE2B03A408B324458C5212E94244AC36D@CHEXMBX3A.CHBOSTON.ORG>,<CAD_Tc=kt6t91FtST7Ysay2N=0DKbFfU=4t8uyoysS8t4JkHwgg@mail.gmail.com>
In-Reply-To: 
 <CAD_Tc=kt6t91FtST7Ysay2N=0DKbFfU=4t8uyoysS8t4JkHwgg@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

I don't understand this question. Can you try to rephrase it? Or maybe if you
tell me what you want to do that would help me understand.

________________________________________
From: Anirban Chakraborti [chakraborti.anirban@googlemail.com]
Sent: Tuesday, May 20, 2014 6:34 AM
To: dev@ctakes.apache.org
Subject: Re: markable types

thanks again Timothy

final question for now

You had explained that each sentence is parsed and is converted to a tree
with head and terminal nodes. Is the typesystem of ctakes a feature of the
node, i.e. can one node belong to two or more typesystems and their further
attributes, OR for each type system, is there a syntax tree for every
sentence parsed? I mean a sentence has various trees attached to it but
there is a 1:1 mapping between the node and typesystem.

Anir


On Tue, May 20, 2014 at 2:17 AM, Miller, Timothy <Timothy.Miller@childrens.harvard.edu> wrote:
>
> On 05/18/2014 07:40 AM, Anirban Chakraborti wrote:
> > Timothy,
> >
> > 1. so to get concepts of procedure, lab (if any), disease disorder, sign
> > symptoms, Anatomical sites, I would need to do
> >
> > List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas,
> > MedicationMention.class));
> > List<DiseaseDisorderMention> disease = new
> > ArrayList<>(JCasUtil.select(jcas, DiseaseDisorderMention.class));
> > List<SignSymptomMention> signs = new ArrayList<>(JCasUtil.select(jcas,
> > SignSymptomMention.class));
> > List<AnatomicalSiteMention> anatomy = new ArrayList<>(
> > JCasUtil.select(jcas, AnatomicalSiteMention.class));
> > List<LabMention> labs = new ArrayList<>(
> > JCasUtil.select(jcas, LabMention.class));
> >
> > then check the size of each list { meds, disease, signs, anatomy, labs },
> > print out the list or make a new list using java.util.List or
> > java.util.ArrayList, as the case might be.  Right ...
> yep
> > 2. I am more interested in the IdentifiedAnnotation class. However there
> > are concepts like FractionAnnotation which are not defined as enums in
> > const.java. How do I handle them? Do I need to add them to the const.java
> > file?
> nope, you probably just want EntityMention (for anatomical sites) and
> EventMention (for all clinical events, including DiseaseDisorder,
> Procedure, SignSymptom, etc.).
>
> >
> > 3. what exactly is the functional difference between say
> > MedicationEventMention.java, MedicationMention.java, Medication.java and
> > MedicationEventMention_Type.java?  I understand a similar difference
> > exists between the classes for lab, procedure, etc.
> The types ending in _Type.java are UIMA-internal types, which you can ignore.
> Medication is a referential type -- something in the real world that
> could be referred to multiple times in a document. What you probably
> want are the mention types. Here I believe MedicationMention is the
> preferred type going forward for a particular mention of a medication in
> text (MedicationEventMention is the same thing but not preferred going
> forward).
>
>
> >
> > 4.  You had explained that each sentence is parsed and is converted to a
> > tree with head and terminal nodes. Is the typesystem of ctakes a feature
> > of the node, i.e. can one node belong to two or more typesystems and their
> > further attributes, OR for each type system, is there a syntax tree for
> > every sentence parsed? I mean a sentence has various trees attached to it
> > but there is a 1:1 mapping between the node and typesystem.
> >
> > Many Thanks
> >
> > Anirban
> >
> >
> > On Thu, May 15, 2014 at 5:03 PM, Miller, Timothy <Timothy.Miller@childrens.harvard.edu> wrote:
> >
> >> Anir -- I'm not sure I understand your question but from your example it
> >> doesn't sound like a tree exactly. If you just want a list of medication
> >> concepts you can do something like:
> >>
> >> List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas,
> >> MedicationMention.class));
> >> (I believe MedicationMention is the correct class but check your output.)
> >>
> >> If you really do want to put them into a syntax tree, there are also
> >> methods for doing that in the AnnotationTreeUtils class.
> >>
> >> getAnnotationTree(JCas, Annotation) will give you the tree for the whole
> >> sentence containing the annotation you give it.
> >> annotationNode(JCas, Annotation) will give you the smallest subtree
> >> covering the annotation you give it.
> >> insertAnnotationNode(JCas, TopTreebankNode, Annotation, String) will
> >> insert a node into the tree specified at the level specified by the
> >> annotation with the category specified by the string. So for example if
> >> you had meds as above you could then do:
> >>
> >> for(MedicationMention med : meds){
> >>   AnnotationTreeUtils.insertAnnotationNode(jcas,
> >>       AnnotationTreeUtils.getAnnotationTree(jcas, med), med, "MEDICATION");
> >> }
> >>
> >> which would insert a new node into every tree with the label "MEDICATION"
> >> in every position where a medication was found.
> >>
> >> One caveat to the above code is that these methods actually will change
> >> the tree in the cas. That might be ok for some use cases but for many you
> >> want to work on a tree outside the cas, so that's why there are also
> >> methods:
> >> getTreeCopy(JCas, TopTreebankNode)
> >> getTreeCopy(JCas, TreebankNode)
> >>
> >> If you use the getAnnotationTree method to obtain the tree you want, then
> >> you can get a copy from these methods, then use the insert methods and do
> >> something with them immediately (like print them out), without altering
> >> the originals in the cas if other AEs may use them.
> >>
> >> Tim
> >>
> >>
> >> ________________________________________
> >> From: Anirban Chakraborti [chakraborti.anirban@googlemail.com]
> >> Sent: Sunday, May 11, 2014 9:15 AM
> >> To: dev@ctakes.apache.org
> >> Subject: Re: markable types
> >>
> >> Steven,
> >>
> >> Would you have any example code of a tree parser so the output can be
> >> arranged as needed? I mean, after successful annotation, I want to
> >> extract certain concepts like medication only and arrange them in a new
> >> tree so that all annotations referring to the medication concept and
> >> their sources are listed together.
> >>
> >> Anir
> >>
> >>
> >> On Sun, May 11, 2014 at 3:55 PM, Steven Bethard <steven.bethard@gmail.com>
> >> wrote:
> >>> I don't think "not something anyone would want extracted" should be an
> >>> argument against anything. We already have constituent and dependency
> >>> parse trees in the type system, and those would fall under that
> >>> category.
> >>>
> >>> So +1 on markables in the type system. (In general, +1 on moving
> >>> module-specific types to the standard type system. I'm not sure what
> >>> the real benefit of splitting them out is...)
> >>>
> >>> Steve
> >>>
> >>> On Fri, May 9, 2014 at 11:53 AM, Miller, Timothy
> >>> <Timothy.Miller@childrens.harvard.edu> wrote:
> >>>> What do people think about taking the "markable" types out of the
> >>>> coreference project and adding them to the standard type system? This
> >>>> is a pretty standard concept in coreference that doesn't really have a
> >>>> great natural representation in the current type system -- it
> >>>> encompasses IdentifiedAnnotations as well as pronouns ("It", "him",
> >>>> "her") and some determiners ("this").
> >>>>
> >>>> The drawback I can see is that it is probably not something anyone
> >>>> would want extracted -- ultimately you want the actual coref pairs or
> >>>> chains. But it is useful for things like representing gold standard
> >>>> input or splitting coreference resolution into separate markable
> >>>> recognition and relation classification steps.
> >>>>
> >>>> Tim
> >>>>
>
> --
> Tim Miller
> Instructor
> Boston Children's Hospital and Harvard Medical School
> timothy.miller@childrens.harvard.edu
> 617-919-1223
>
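
The copy-then-insert workflow from the 15 May message quoted above (get the
tree, take a copy, insert the "MEDICATION" node into the copy, print it) can be
sketched without a cTAKES install. This is only an illustration under stated
assumptions: Node, copy, insertNode, render and sampleTree are hypothetical
stand-ins written for this sketch, not the cTAKES TreebankNode or the
AnnotationTreeUtils methods, and the real insertion logic may well differ.

```java
import java.util.ArrayList;
import java.util.List;

public class TreeCopySketch {

    /** Hypothetical stand-in for a parse-tree node; NOT the cTAKES TreebankNode. */
    static final class Node {
        final String label;
        final int begin;   // first character offset covered by this node
        final int end;     // offset just past the last covered character
        final List<Node> children = new ArrayList<>();

        Node(String label, int begin, int end) {
            this.label = label;
            this.begin = begin;
            this.end = end;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Deep copy, standing in for the role getTreeCopy plays in the thread. */
    static Node copy(Node node) {
        Node result = new Node(node.label, node.begin, node.end);
        for (Node child : node.children) {
            result.add(copy(child));
        }
        return result;
    }

    /** Returns a child that covers the span without being equal to it, or null. */
    static Node strictlyCoveringChild(Node parent, int begin, int end) {
        for (Node child : parent.children) {
            boolean covers = child.begin <= begin && end <= child.end;
            boolean same = child.begin == begin && child.end == end;
            if (covers && !same) {
                return child;
            }
        }
        return null;
    }

    /** Wraps the children lying inside [begin, end) in a new labelled node. Mutates the tree. */
    static void insertNode(Node root, int begin, int end, String label) {
        Node parent = root;
        for (Node next = strictlyCoveringChild(parent, begin, end); next != null;
                next = strictlyCoveringChild(parent, begin, end)) {
            parent = next;
        }
        Node inserted = new Node(label, begin, end);
        List<Node> rebuilt = new ArrayList<>();
        for (Node child : parent.children) {
            if (begin <= child.begin && child.end <= end) {
                if (inserted.children.isEmpty()) {
                    rebuilt.add(inserted);   // new node takes the first wrapped child's slot
                }
                inserted.add(child);
            } else {
                rebuilt.add(child);
            }
        }
        if (inserted.children.isEmpty()) {
            rebuilt.add(inserted);           // span matched no child; append as an empty node
        }
        parent.children.clear();
        parent.children.addAll(rebuilt);
    }

    /** Bracketed rendering, labels only. */
    static String render(Node node) {
        StringBuilder sb = new StringBuilder("(").append(node.label);
        for (Node child : node.children) {
            sb.append(' ').append(render(child));
        }
        return sb.append(')').toString();
    }

    /** Toy tree for "Patient takes aspirin" (offsets 0-7, 8-13, 14-21). */
    static Node sampleTree() {
        Node vp = new Node("VP", 8, 21).add(new Node("V", 8, 13)).add(new Node("NP", 14, 21));
        return new Node("S", 0, 21).add(new Node("NP", 0, 7)).add(vp);
    }

    public static void main(String[] args) {
        Node original = sampleTree();
        Node working = copy(original);              // work on a copy, not the shared tree
        insertNode(working, 14, 21, "MEDICATION");  // span of the medication mention
        System.out.println(render(original));       // (S (NP) (VP (V) (NP)))
        System.out.println(render(working));        // (S (NP) (VP (V) (MEDICATION (NP))))
    }
}
```

The original tree prints unchanged because the insertion only touches the copy,
which is the reason given above for copying before inserting when other AEs
may still read the tree in the cas.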