ctakes-dev mailing list archives

From Richard Eckart de Castilho <richard.eck...@gmail.com>
Subject Re: CTAKES-248- include original covered text of NEs which can't be recovered post if NE is from a disjoint span
Date Wed, 02 Oct 2013 07:17:01 GMT
What benefit would it have to store a string with some separation character (which may mean
that the separation character in the elements may need to be escaped), over using a feature
of type FSArray<Token> pointing to the original segments?
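
To make the trade-off concrete, here is a minimal sketch in plain Java (no UIMA
dependency). Everything in it is hypothetical and for illustration only: the
class name, the '|' delimiter, the backslash escape, and the use of offset
pairs as a stand-in for an FSArray of token references. It is not the cTAKES
implementation. It shows that a delimited string needs escaping on write and
unescaping on read, while references into the document text need neither.

```java
import java.util.ArrayList;
import java.util.List;

public class DisjointSpanSketch {

    // Hypothetical delimiter and escape character, chosen for illustration only.
    static final char DELIM = '|';
    static final char ESC = '\\';

    /** Joins segments with DELIM, escaping DELIM and ESC inside each segment. */
    static String join(List<String> segments) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < segments.size(); i++) {
            if (i > 0) {
                sb.append(DELIM);
            }
            for (char c : segments.get(i).toCharArray()) {
                if (c == DELIM || c == ESC) {
                    sb.append(ESC);
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Reverses join(): splits on unescaped DELIM and drops the escape characters. */
    static List<String> split(String joined) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < joined.length(); i++) {
            char c = joined.charAt(i);
            if (c == ESC && i + 1 < joined.length()) {
                cur.append(joined.charAt(++i)); // take the escaped character literally
            } else if (c == DELIM) {
                out.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString());
        return out;
    }

    /** Alternative: keep each segment as a {begin, end} offset pair into the document. */
    static List<String> fromOffsets(String doc, int[][] spans) {
        List<String> out = new ArrayList<>();
        for (int[] s : spans) {
            out.add(doc.substring(s[0], s[1]));
        }
        return out;
    }

    public static void main(String[] args) {
        String doc = "Acute alcoholic liver disease.";
        int[][] spans = { {0, 5}, {22, 29} }; // offsets of "Acute" and "disease"
        List<String> segments = fromOffsets(doc, spans);
        System.out.println(segments);
        System.out.println(join(segments));
    }
}
```

With references, the original order and the exact offsets of each segment stay
available, and a segment that happens to contain the delimiter needs no special
handling.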

Not sure if that is what Karthik meant when referring to fetching the matched atom.

-- Richard

On 02.10.2013, at 01:46, Karthik Sarma <ksarma@ksarma.com> wrote:

> Hmm, couldn't you just fetch the matched atom and use that? Should be the
> same information (without, I suppose, the original ordering and split).
> 
> --
> Karthik Sarma
> UCLA Medical Scientist Training Program Class of 20??
> Member, UCLA Medical Imaging & Informatics Lab
> Member, CA Delegation to the House of Delegates of the American Medical
> Association
> ksarma@ksarma.com
> gchat: ksarma@gmail.com
> linkedin: www.linkedin.com/in/ksarma
> 
> 
> On Tue, Oct 1, 2013 at 12:37 PM, Masanz, James J. <Masanz.James@mayo.edu> wrote:
> 
>> Yes, this would help address that multiple permutations example.  The new
>> getOriginalText method would return something like "Acute|Disease".  Right
>> now I'm thinking of just using vertical bar as delimiter, to start with at
>> least, but think it should be configurable.
>> 
>> -----Original Message-----
>> From: dev-return-2067-Masanz.James=mayo.edu@ctakes.apache.org [mailto:
>> dev-return-2067-Masanz.James=mayo.edu@ctakes.apache.org] On Behalf Of
>> Chen, Pei
>> Sent: Tuesday, October 01, 2013 9:38 AM
>> To: dev@ctakes.apache.org
>> Subject: CTAKES-248- include original covered text of NEs which can't be
>> recovered post if NE is from a disjoint span
>> 
>> This sounds pretty cool.
>> James, will this address the multiple permutations lookup example:
>> "Acute alcoholic liver disease."  There is a cui: C0001314: Acute Disease,
>> but if you getCoveredText(), on the UMLSConcept, you would actually get the
>> same "Acute alcoholic liver disease" instead of "Acute Disease".
>> So, there is a new field called getOriginalText() that matched the hit?
>> 
>>> -----Original Message-----
>>> From: james-masanz@apache.org [mailto:james-masanz@apache.org]
>>> Sent: Monday, September 30, 2013 5:49 PM
>>> To: commits@ctakes.apache.org
>>> Subject: svn commit: r1527792 - /ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
>>> 
>>> Author: james-masanz
>>> Date: Mon Sep 30 21:48:01 2013
>>> New Revision: 1527792
>>> 
>>> URL: http://svn.apache.org/r1527792
>>> Log:
>>> CTAKES-248 - for named entities, since the annotation just has the begin and
>>> end offset, it is requested to have a way to get the original covered text
>>> (especially for disjoint spans) so it is possible to know which words in the
>>> covered text were actually used in the matching to the dictionary entry
>>> 
>>> Modified:
>>>    ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
>>> 
>>> Modified: ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
>>> URL: http://svn.apache.org/viewvc/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml?rev=1527792&r1=1527791&r2=1527792&view=diff
>>> ==============================================================================
>>> Binary files - no diff available.
