ctakes-dev mailing list archives

From digital paula <cybersat...@hotmail.com>
Subject RE: cTakes: question on updating cue words
Date Wed, 15 Jan 2014 04:34:27 GMT
Hello again,  I've been busy figuring out the issue with the sectionizer and had forgotten about
this... I still haven't received a response.  :-(  I really hope Matt or anyone familiar with
cue word updates can help.
 
Developers/users should be able to update cue words because the category a cue word falls
into can be subjective.  For example, 'predictive of' could be a negation or an uncertainty cue
depending on whether the task considers the current or the future state.  There are several
cue terms in the cue term files whose assignment I'd like to change.
 
Who determines what gets classified as negation or speculation and so forth?  Is it a committee?
I really hope someone can explain more about the cue.model and what I described in my previous email.
 
 
Thanks. 
 
Regards,
Paula
 
From: cybersation@hotmail.com
To: user@ctakes.apache.org; dev@ctakes.apache.org
Subject: RE: cTakes: question on updating cue words
Date: Tue, 7 Jan 2014 13:36:33 -0500







Hi Matt,
 
I realized that I should have posted this on the developer site.
 
First of all, thanks for your followup a few weeks back.  I hadn't been subscribed to the
developer list prior to your post so I didn't see it until Pei mentioned it.
 
Not sure if you saw my response on user list but I wasn't able to get your suggestion to work
so I defaulted back to what I was trying to do with updating the medfacts snapshot jar.  After
stepping through the code I see that cTAKES was identifying the new cue term from the updated
medfacts snapshot jar (I had decompiled the jar, added the new term 'predictive of', then
added jar back to cTAKES).  Okay, so cTAKES did identify the new cue term but it's not getting
allocated to the 'possible' assertion type.   Since last night I've looked at several of the
java files in the medfacts jar and quickly realized that UIMA is not just the foundation of
cTAKES, MITRE is too!   
 
I would love to understand how the cue word type (e.g. 'predictive of') gets associated
with the assertion type (e.g. 'possible').  I can't seem to figure that out by looking at
the code in the medfacts jar.  I'd like to understand how I can update it so my new cue
word gets recognized as a 'possible' assertion type.  The existing words in the speculation
file are getting the correct 'possible' assertion type.  Only the new cue term I added defaults
to 'present'.  This leads me to wonder if it's the cue.model that's doing the assignment
and has to be updated to recognize the new cue word.  I'm hoping you can elucidate further on
what the cue.model file is and how it works.  It appears to be a binary of some type.  What
tools would be needed to update it?  Hmmm, the fact that it has a .model extension leads me
to believe that it's the result of some hefty machine learning.
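The guess above can be illustrated with a toy example. This is only a sketch of the general idea of a trained classifier, not the medfacts code or the cue.model format, neither of which is described in this thread. The weights, feature names, and labels below are all made up for illustration. It shows why a cue term that was never seen in training could fall through to a default label even though it is listed in a cue file.

```python
# Toy weights, as if learned from training data: (feature, label) -> weight.
# All values here are hypothetical and chosen only for illustration.
WEIGHTS = {
    ("cue=improbable", "possible"): 2.0,
    ("cue=no", "absent"): 2.5,
}

# Hypothetical per-label bias; 'present' wins when no feature has a weight.
BIAS = {"present": 1.0, "possible": 0.0, "absent": 0.0}


def classify(features):
    """Return the label with the highest summed score for the given features."""
    scores = dict(BIAS)
    for feature in features:
        for label in scores:
            # A feature never seen in training has no weight and adds 0.0.
            scores[label] += WEIGHTS.get((feature, label), 0.0)
    return max(scores, key=scores.get)


seen = classify(["cue=improbable"])       # term with a trained weight
unseen = classify(["cue=predictive of"])  # term added to a list but never trained on
```

In this toy setup the new term is found as a feature but carries no learned weight, so the bias decides the label. If the real model behaves similarly, it would have to be retrained with examples of the new cue rather than edited by hand.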
 
Thanks so much.
 
Regards,
Paula

From: cybersation@hotmail.com
To: user@ctakes.apache.org
Subject: RE: cTakes: question on updating cue words
Date: Sun, 5 Jan 2014 21:38:55 -0500




Happy New Year cTAKES Community! Hopefully everyone's staying warm.
 
Okay, I did try Matt's suggestion from developer site
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cCED4DCCB.126B0%25mcoarr@mitre.org%3e
 
 but unfortunately it didn't work, so I stepped through the code to see how cue words are
being used in cTAKES.  I verified that the roughly 30 txt cue word files in the medfacts
snapshot jar are all being read.  Before I tried Matt's recent suggestion, I had decompiled
the medfacts snapshot jar, added the new cue terms, and put the jar back into cTAKES.  I
thought the change wasn't being recognized, but it is.  I had updated the speculation text
file with 'predictive of' as a new cue term, and while stepping through I saw that
'predictive of' was recognized as a cue term from the speculation file.
 
 
The problem is that it gets annotated as 'present', not as 'possible'.  That's why I thought
the updated cue term wasn't being recognized.  I did a quick test using one of the existing
terms ('improbable') from the speculation text file, the same file that contains the new cue
term 'predictive of', and sure enough it got annotated as 'possible', which has
me wondering about the cue model and how it gets generated.
 
So to echo what Tim stated, what does the cue model do?  What exactly is that file and how
can the contents be viewed and regenerated or updated?   Something clearly has to be updated
so 'predictive of' gets annotated as 'possible'.  
 
I'm so close to getting this resolved so I would appreciate any assistance.
 
Thanks.
 
Regards,
Paula
 
From: Timothy.Miller@childrens.harvard.edu
To: user@ctakes.apache.org
Subject: Re: cTakes: question on updating cue words
Date: Tue, 24 Dec 2013 14:19:46 +0000






Actually, I think Matt's suggestion is a bit out of date -- during development we removed
the dependency on the lucene dictionary lookup and now the under development version does
read those psv files directly.
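For readers unfamiliar with the format: .psv is commonly used for pipe-separated values, i.e. text rows whose fields are delimited by `|`. The column layout of the cTAKES cue word files is not given in this thread, so the two-column sample below (cue phrase, category) is purely a hypothetical layout. A minimal sketch of reading such text:

```python
import csv
import io


def read_psv(text):
    """Parse pipe-separated text into rows of whitespace-stripped fields."""
    # QUOTE_NONE: treat quote characters as ordinary text, split only on '|'.
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    # Blank lines come back as empty rows and are skipped.
    return [[field.strip() for field in row] for row in reader if row]


# Hypothetical sample; the real files may use different columns.
sample = "predictive of|speculation\nimprobable | speculation\n\n"
rows = read_psv(sample)
```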




But this still doesn't help Paula since she's trying to run the current release. I thought
Matt or Pei might have some info about whether it's possible to modify negation cue words for
that release? For example, I can see in the code it uses a "cue model" which can be found in
ctakes-assertion-res, but it is a binary and I'm not sure what kind. Is there
any way to modify that file?



Tim





On 12/24/2013 12:09 AM, digital paula wrote:



Thank you, Pei.  I believe I had signed up for the dev list right after Matt posted so I didn't
see his email.  I will try it out. 

 

Merry Christmas to you and everyone on the list.  :-)

 

Regards,

Paula

 



Date: Mon, 23 Dec 2013 23:42:30 -0500

Subject: Re: cTakes: question on updating cue words

From: chenpei@apache.org

To: user@ctakes.apache.org



Paula,
Were you able to try Matt's suggestion on dev@?
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3cCED4DCCB.126B0%25mcoarr@mitre.org%3e










On Mon, Dec 23, 2013 at 11:57 AM, digital paula 
<cybersation@hotmail.com> wrote:



Hello again cTAKES Community,

 

I think Tim's away for the holidays since I didn't see any  response.   Could someone else
assist?  To reiterate, I'd like to manually update the cue words for the polarity and uncertainty
features.     Please see below for details.

 

Thanks.

 

Regards,

Paula



 



From: 
cybersation@hotmail.com

To: 
user@ctakes.apache.org


Subject: RE: cTakes: question on updating cue words


Date: Thu, 19 Dec 2013 16:20:25 -0500





Hi Tim,  I just realized that my manual cue word updates didn't take.   :-(

 

I updated these two files from the med-facts.i2b21.2-SNAPSHOT.jar, then rebundled and added
back to cTAKES:

1.  negation_cue_list.txt

2.  certainty.txt
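The "update, rebundle" step can also be scripted, since a jar is a zip archive. The sketch below is one possible way to do it, under the assumptions that the cue files are UTF-8 text with one term per line and that the entry path inside the jar is known. The function name and the paths in the usage note are hypothetical.

```python
import zipfile


def add_cue_term(jar_path, entry_name, new_term, out_path):
    """Copy jar_path to out_path, appending new_term to the text entry entry_name."""
    found = False
    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == entry_name:
                found = True
                terms = data.decode("utf-8").splitlines()
                if new_term not in terms:  # avoid adding a duplicate line
                    terms.append(new_term)
                data = ("\n".join(terms) + "\n").encode("utf-8")
            # Reuse the original entry metadata; only the content may differ.
            dst.writestr(info, data)
    if not found:
        raise KeyError(f"{entry_name} not found in {jar_path}")
```

A call might look like `add_cue_term("medfacts.jar", "path/to/certainty.txt", "predictive of", "medfacts-patched.jar")`, where both jar names and the entry path are placeholders. As the rest of this thread shows, adding the term to the list makes it recognizable as a cue but does not by itself change the assertion label.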

 

Is there another file that you know of that needs to be updated?  The cue folder in
the jar contains only text files, so perhaps there's another text file I need to update.  Would
you know which files those would be, or what other updates need to be made?




 Thanks.

 

Regards,

Paula



Date: Mon, 16 Dec 2013 16:12:29 -0500

From: 
timothy.miller@childrens.harvard.edu

To: 
user@ctakes.apache.org

Subject: Re: cTakes: question on updating cue words



Paula, I think to use the released version of ctakes you will have to do what you proposed
- modify the jar. The checked in files (*.psv) that you are finding are for the under-development
version.

Tim





On 12/16/2013 03:27 PM, digital paula wrote:



Hi Pei,

 

I don't consider this a bug, so I'm not sure why a Jira ticket is needed.  I just need to add 2
cue words and am wondering if I can do it manually.


 

Exploring a bit, I see there are .psv files in the Cue_Words folder of the Assertion component.
I updated them, but that doesn't seem to work.  I also added the cue words to the .txt file in
the Semantic_Classes folder (in Assertion as well), and that didn't work either.

 

One thing that I haven't tried is updating the cue words in the org.mitre.medfacts.i2b2.cuefiles
package in the jar file medfacts-i2b2-1.2.SNAPSHOT.jar.  That would be a little more work
since I need to extract the files, rebuild the jar, and add it back to the project.
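Before and after rebuilding, it can help to check which cue files in the jar actually list a given term. A minimal sketch, assuming the cue files are .txt entries with one term per line; the function name is hypothetical, and the prefix in the usage note is only the package named above rewritten as a path.

```python
import zipfile


def find_term_in_jar(jar_path, term, prefix=""):
    """Return the sorted names of .txt entries under prefix that list term on its own line."""
    hits = []
    wanted = term.strip().lower()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.startswith(prefix) or not name.endswith(".txt"):
                continue
            text = jar.read(name).decode("utf-8", errors="replace")
            # Case-insensitive, whole-line comparison.
            if any(line.strip().lower() == wanted for line in text.splitlines()):
                hits.append(name)
    return sorted(hits)
```

For example, `find_term_in_jar(jar, "predictive of", prefix="org/mitre/medfacts/i2b2/cuefiles/")` would list the cue files under that package that contain the term, which confirms that the jar on the classpath is the patched one.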


 

I'm on a tight deadline and hoping I can make these changes today, so I hope adding
a couple of cue words doesn't require a lot of time.

 

By the way, what is a .psv file?

 

Thanks.

 

Regards,

Paula

 



From: 
Pei.Chen@childrens.harvard.edu

To: 
dev@ctakes.apache.org

Subject: RE: cTakes: question on updating cue words

Date: Mon, 16 Dec 2013 19:31:45 +0000



[moved to dev@]

Hi Paula,

My suggestion would be to open a Jira item so that it could be tracked:

https://issues.apache.org/jira/browse/CTAKES (Feel free to create a new account).

Even cooler if you could attach the affected files with the patch(diffs) and any tests.

--Pei

 

 




From:
 digital paula [mailto:cybersation@hotmail.com]


Sent: Monday, December 16, 2013 1:30 PM

To: 
user@ctakes.apache.org

Subject: cTakes: question on updating cue words



 

Hello again cTAKES Community,

 

I would like to  add additional cue words to polarity (for negation) and uncertainty.    I
would so appreciate if someone can let me know how I can add additional cue words.   


 

Thanks.

 

Regards,

Paula 

























 		 	   		  
 		 	   		   		 	   		  