ctakes-user mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject RE: Regarding Entity Recognition
Date Wed, 24 Apr 2013 13:26:18 GMT
Hi Ravi,
In the attached LookupDesc_csv_sample.xml, it looks like it is still configured to use DirectLookupInitializerImpl:

<dictionaryRef idRef="DICT_CSV_SAMPLE" /> 
<lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.DirectLookupInitializerImpl">
</lookupInitializer>

Try:
<lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
  <properties>
    <property key="textMetaFields" value="0|1"/>
    <property key="maxPermutationLevel" value="7"/>
    <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
  </properties>
</lookupInitializer>
Instead.

> I also wanted to know if this is the only method to use a non-UMLS vocabulary as a dictionary
> in cTAKES?
It should be configurable to use CSV, Lucene, MySQL/HSQLDB, etc., where one can insert a
custom vocabulary, and one can also implement custom lookup algorithms if needed.
--Pei

From: ravi garg [mailto:ravigarg27@gmail.com] 
Sent: Wednesday, April 24, 2013 8:29 AM
To: user@ctakes.apache.org
Subject: Re: Regarding Entity Recognition

Hey, sorry for the delayed reply.
I believe the changes you are suggesting are to be made in LookupDesc_csv_sample.xml. I made
those changes but still didn't get the required results.
I am attaching the files here for reference.
I also wanted to know if this is the only method to use a non-UMLS vocabulary as a dictionary
in cTAKES?
Regards,
Ravi Garg 

On Tue, Apr 23, 2013 at 1:40 AM, Chen, Pei <Pei.Chen@childrens.harvard.edu> wrote:
Ravi, 
Could you please attach the DictionaryLookupAnnotarCSV.xml?
In particular, please consider using FirstTokenPermLookupInitializerImpl instead of DirectLookup.
 
<lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
  <properties>
    <property key="textMetaFields" value="0|1"/>
    <property key="maxPermutationLevel" value="7"/>
    <property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
  </properties>
</lookupInitializer>
 
I hope that helps.
 
From: ravi garg [mailto:ravigarg27@gmail.com] 
Sent: Monday, April 22, 2013 4:09 PM
To: user@ctakes.apache.org
Subject: Re: Regarding Entity Recognition
 
Sorry, but this doesn't solve the problem either.
 
On Tue, Apr 23, 2013 at 1:28 AM, Savova, Guergana <Guergana.Savova@childrens.harvard.edu>
wrote:
Try adding in the dictionary:
Knee|knee pain|..
 
The first field is reserved for the first word of the phrase.
Regards,
--Guergana
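
To illustrate why the first field matters, here is a minimal Python sketch of a first-token-indexed lookup. It is not the cTAKES implementation; the function names `load_first_token_index` and `find_phrases` are hypothetical, and the two-field row layout (first word, then full phrase) is assumed from the description above. A row keyed on "knee" is reachable when the scanner sees the token "knee", while a row keyed on "knee pain" is never reached, because no single token equals "knee pain".

```python
from collections import defaultdict


def load_first_token_index(lines):
    """Index pipe-delimited rows (first_word|full phrase|...) by their first field.

    Hypothetical helper for illustration; not part of cTAKES.
    """
    index = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2:
            continue  # a row needs at least a key and a phrase
        first_word, phrase = fields[0].lower(), fields[1].lower()
        index[first_word].append(phrase.split())
    return index


def find_phrases(tokens, index):
    """Return (start, end, phrase) for each indexed phrase that begins at a token."""
    tokens = [t.lower() for t in tokens]
    hits = []
    for i, tok in enumerate(tokens):
        # Only rows whose key equals this single token are candidates.
        for phrase_tokens in index.get(tok, []):
            end = i + len(phrase_tokens)
            if tokens[i:end] == phrase_tokens:
                hits.append((i, end, " ".join(phrase_tokens)))
    return hits


text_tokens = "Patient reports knee pain".split()

# Row keyed on the first word: the phrase is found.
good_index = load_first_token_index(["knee|knee pain"])
good_hits = find_phrases(text_tokens, good_index)

# Row keyed on the whole phrase: no single token matches the key.
bad_index = load_first_token_index(["knee pain| knee pain"])
bad_hits = find_phrases(text_tokens, bad_index)

print(good_hits)
print(bad_hits)
```

In this sketch, the only difference between the two dictionaries is the first field, which is the change suggested above.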
 
From: ravi garg [mailto:ravigarg27@gmail.com] 
Sent: Monday, April 22, 2013 3:37 PM
To: user@ctakes.apache.org
Subject: Re: Regarding Entity Recognition
 
Hey,
Thanks for the reply.
First, let me describe the configuration I am using. I am using AggregatePlaintextProcessor.xml
with DictionaryLookupAnnotar being DictionaryLookupAnnotarCSV.xml, which reads the dictionary from
two sources: the flat file dictionary1.csv and a Lucene index. I have
added knee pain as a single term in dictionary1.csv (as knee pain| knee pain), but I am
still not able to get it as a single entity. Am I missing something here?
Regards,
Ravi Garg
 
On Tue, Apr 23, 2013 at 12:49 AM, Chen, Pei <Pei.Chen@childrens.harvard.edu> wrote:
Hi Ravi,
Yes, in your example "knee pain", the default behavior in the dictionary lookup will create
3 IdentifiedAnnotations:
"knee", "pain", and "knee pain".
 
[Assuming the terms exist in the UMLS dictionary]
--Pei
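
As a rough illustration of the behavior described above, the following Python sketch enumerates every dictionary term found in a token window, which yields the three overlapping spans for "knee pain". It is not cTAKES code: `all_matches` and `keep_longest` are hypothetical names, and the longest-span filter is only one possible post-processing step for readers who want a single entity. It is not a feature mentioned in this thread.

```python
def all_matches(tokens, terms):
    """Return every (start, end, text) span whose lowercased text is in terms."""
    tokens = [t.lower() for t in tokens]
    max_len = max(len(t.split()) for t in terms)  # longest term, in tokens
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            text = " ".join(tokens[start:end])
            if text in terms:
                spans.append((start, end, text))
    return spans


def keep_longest(spans):
    """Drop any span that is covered by a different span (assumed post-filter)."""
    return [
        s for s in spans
        if not any(o != s and o[0] <= s[0] and s[1] <= o[1] for o in spans)
    ]


# Stand-in for dictionary content; assumes all three terms are present.
terms = {"knee", "pain", "knee pain"}

spans = all_matches(["knee", "pain"], terms)
longest = keep_longest(spans)

print(spans)
print(longest)
```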
 
From: ravi garg [mailto:ravigarg27@gmail.com] 
Sent: Monday, April 22, 2013 3:06 PM
To: user@ctakes.apache.org
Subject: Regarding Entity Recognition
 
Hey,
First of all, congrats on building such wonderful software. I am very new to cTAKES, so I have
a very basic question to ask.
Is it possible to identify multiple words as a single entity? For example, right now
knee pain gets identified as 'knee' and 'pain', but is it possible to get 'knee pain' as a single
entity? If so, what changes do I have to make to get going?



-- 
Ravi Garg
3rd Year
MSc (hons) Biological Sciences
B.E (hons) Computer Science and Engineering
BITS Pilani KK Birla Goa Campus



