ctakes-dev mailing list archives

From samir chabou <samir...@yahoo.com>
Subject Re: CTAKES-248- include original covered text of NEs which can't be recovered post if NE is from a disjoint span
Date Wed, 23 Oct 2013 00:04:20 GMT
Hi Pei,
Does this mean that your proposal below is now ready to use?
<< FYI: I was proposing adding an additional attribute to store the 
description/preferredText(term) [1] since this information is already 
available in the dictionary lookup.
I think most folks would find this useful in addition to just saving the
CUI/Code. Otherwise, they would have to do another lookup further
downstream to get the description of the CUI/Code.>>

On Tuesday, October 22, 2013 4:01:35 PM, "Chen, Pei" <Pei.Chen@childrens.harvard.edu> wrote:
 
Done.


> -----Original Message-----
> From: Masanz, James J. [mailto:Masanz.James@mayo.edu]
> Sent: Tuesday, October 22, 2013 2:33 PM
> To: 'dev@ctakes.apache.org'
> Subject: RE: CTAKES-248- include original covered text of NEs which can't be
> recovered post if NE is from a disjoint span
> 
> Sure, if you would, that would be great. Thanks.
> 
> -----Original Message-----
> From: dev-return-2128-Masanz.James=mayo.edu@ctakes.apache.org
> [mailto:dev-return-2128-Masanz.James=mayo.edu@ctakes.apache.org] On
> Behalf Of Chen, Pei
> Sent: Tuesday, October 22, 2013 1:30 PM
> To: dev@ctakes.apache.org
> Subject: RE: CTAKES-248- include original covered text of NEs which can't be
> recovered post if NE is from a disjoint span
> 
> James,
> I was making some changes to the ctakes common type system for CTAKES-
> 224 (Adding a field to save the UMLS term/text in addition to the
> CUI/Codes).
> Do you want me to also make originalText an FSArray<BaseToken> instead of
> String while I have these files open?
> 
> --Pei
> 
> > -----Original Message-----
> > From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu]
> > Sent: Wednesday, October 02, 2013 10:23 AM
> > To: dev@ctakes.apache.org
> > Subject: RE: CTAKES-248- include original covered text of NEs which
> > can't be recovered post if NE is from a disjoint span
> >
> > > +1 to have a pointer back to the BaseToken(s) rather than a String
> > > (so we could get back the spans and other info if needed).
> > > I think the atom will be slightly different. Take, for example:
> > Sentence/LookupWindow: "alcoholic liver disease was acute."
> > originalText: "disease acute" [New feature to store the Tokens that
> > were matched due to the permutations?]
> > UmlsConcept.cui: C0001314
> > UmlsConcept.preferredText: "Acute Disease" [New feature to store the
> > atom/text returned by the UMLS CUI]
> >
> > I also ran into a similar case where I wish
> > IdentifiedAnnotation.segmentID/SentenceID was the actual Segment type
> > and not a String.
> >
> > This is just my 2 cents... open to ideas though.
> > --Pei
> >
> >
> > > -----Original Message-----
> > > From: Richard Eckart de Castilho [mailto:richard.eckart@gmail.com]
> > > Sent: Wednesday, October 02, 2013 3:19 AM
> > > To: dev@ctakes.apache.org
> > > Subject: Re: CTAKES-248- include original covered text of NEs which
> > > can't be recovered post if NE is from a disjoint span
> > >
> > > What benefit would it have to store a string with some separation
> > > character (which may mean that the separation character in the
> > > elements may need to be escaped), over using a feature of type
> > > FSArray<Token> pointing to the original segments?
> > >
> > > Not sure if that is what Karthik meant when referring to fetching
> > > the matched atom.
> > >
> > > -- Richard
> > >
> > > On 02.10.2013, at 01:46, Karthik Sarma <ksarma@ksarma.com> wrote:
> > >
> > > > > Hmm, couldn't you just fetch the matched atom and use that? Should
> > > > > be the same information (without, I suppose, the original ordering
> > > > > and split).
> > > >
> > > > --
> > > > Karthik Sarma
> > > > UCLA Medical Scientist Training Program Class of 20??
> > > > > Member, UCLA Medical Imaging & Informatics Lab
> > > > > Member, CA Delegation to the House of Delegates of the American
> > > > > Medical Association
> > > > ksarma@ksarma.com
> > > > gchat: ksarma@gmail.com
> > > > linkedin: www.linkedin.com/in/ksarma
> > > >
> > > >
> > > > > On Tue, Oct 1, 2013 at 12:37 PM, Masanz, James J. <Masanz.James@mayo.edu> wrote:
> > > >
> > > >> Yes, this would help address that multiple permutations example.
> > > >> The new getOriginalText method would return something like
> > > >> "Acute|Disease".  Right now I'm thinking of just using vertical
> > > > >> bar as delimiter, to start with at least, but think it should be
> > > > >> configurable.
> > > >>
> > > >> -----Original Message-----
> > > > >> From: dev-return-2067-Masanz.James=mayo.edu@ctakes.apache.org
> > > > >> [mailto:dev-return-2067-Masanz.James=mayo.edu@ctakes.apache.org] On
> > > > >> Behalf Of Chen, Pei
> > > >> Sent: Tuesday, October 01, 2013 9:38 AM
> > > >> To: dev@ctakes.apache.org
> > > >> Subject: CTAKES-248- include original covered text of NEs which
> > > >> can't be recovered post if NE is from a disjoint span
> > > >>
> > > >> This sounds pretty cool.
> > > >> James, will this address the multiple permutations lookup example:
> > > >> "Acute alcoholic liver disease."  There is a cui: C0001314: Acute
> > > >> Disease, but if you getCoveredText(), on the UMLSConcept, you
> > > >> would actually get the same "Acute alcoholic liver disease"
> > > > >> instead of "Acute Disease".
> > > > >> So, is there now a new getOriginalText() method that returns the
> > > > >> text that matched the hit?
> > > >>
> > > >>> -----Original Message-----
> > > > >>> From: james-masanz@apache.org [mailto:james-masanz@apache.org]
> > > > >>> Sent: Monday, September 30, 2013 5:49 PM
> > > > >>> To: commits@ctakes.apache.org
> > > > >>> Subject: svn commit: r1527792 - /ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
> > > >>>
> > > >>> Author: james-masanz
> > > >>> Date: Mon Sep 30 21:48:01 2013
> > > >>> New Revision: 1527792
> > > >>>
> > > >>> URL: http://svn.apache.org/r1527792
> > > >>> Log:
> > > > >>> CTAKES-248  - for named entities, since the annotation just has the
> > > > >>> begin and end offset, it is requested to have a way to get the
> > > > >>> original covered text (especially for disjoint spans) so it is
> > > > >>> possible to know which words in the covered text were actually
> > > > >>> used in the matching to the dictionary entry
> > > >>>
> > > > >>> Modified:
> > > > >>>    ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
> > > > >>>
> > > > >>> Modified: ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
> > > > >>> URL: http://svn.apache.org/viewvc/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml?rev=1527792&r1=1527791&r2=1527792&view=diff
> > > > >>>
> > > > >>> ==============================================================================
> > > >>> Binary files - no diff available.
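
The trade-off discussed in this thread (a delimited String versus pointers back to the matched tokens) can be illustrated with a minimal, self-contained sketch. It does not use the real cTAKES or UIMA classes: Token is a hypothetical stand-in for BaseToken, and joinOriginalText, coveredText and the class name are invented for illustration only. The sentence and matched words are taken from Pei's example above.

```java
import java.util.Arrays;
import java.util.List;

public class DisjointSpanSketch {

    /** Hypothetical stand-in for a cTAKES BaseToken: character offsets into the document. */
    static final class Token {
        final int begin;
        final int end;

        Token(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Option 1: flatten the matched tokens into a delimited String; offsets are lost. */
    static String joinOriginalText(String doc, List<Token> matched, String delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < matched.size(); i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            Token t = matched.get(i);
            sb.append(doc, t.begin, t.end);
        }
        return sb.toString();
    }

    /**
     * What a single begin/end annotation covers: everything from the first matched
     * offset to the last, including words that were not part of the match.
     * Assumes a non-empty token list.
     */
    static String coveredText(String doc, List<Token> matched) {
        int begin = Integer.MAX_VALUE;
        int end = Integer.MIN_VALUE;
        for (Token t : matched) {
            begin = Math.min(begin, t.begin);
            end = Math.max(end, t.end);
        }
        return doc.substring(begin, end);
    }

    public static void main(String[] args) {
        String doc = "alcoholic liver disease was acute.";
        // Option 2: keep the matched tokens themselves ("disease" and "acute"),
        // so spans and ordering remain available downstream.
        List<Token> matched = Arrays.asList(new Token(16, 23), new Token(28, 33));

        System.out.println(coveredText(doc, matched));           // disease was acute
        System.out.println(joinOriginalText(doc, matched, "|")); // disease|acute
    }
}
```

Keeping the token list means the delimited form can always be derived later, while the reverse is not true: once only "disease|acute" is stored, the offsets are gone, and a token that itself contains the delimiter would need escaping, as Richard points out.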