ctakes-dev mailing list archives

From Nick Nikandish <snika...@emerginghealthit.com>
Subject RE: Retrieving CUIs
Date Tue, 08 Jul 2014 19:11:27 GMT
Thanks, James, for answering my question.
I am running "ClinicalPipelineWithUmls", which uses "AggregatePlaintextUMLSProcessor".

The text that I am sending is:

"urine culture mos  source/body site clean catch culture results  >100,001col/ml escherichia
coli final id  e.coli amikacin s ampicillin r >=32". 

My own annotators kick in at the end of the pipeline and annotate the text. The only
element that I am missing is the CUI.
I ran the app with "pain" and "knee" and they were annotated with CUIs, but for the text
above I only got CUIs for the medications.
I looked at the codebase and printed out the CUIs in UmlsToSnomedConsumerImpl, and I
got all of them, like this:

===>Source C1705919
===>Source C0449416
===>MOS C0072454
===>MOS C0026574
===>MOS C0072454
===>Urine culture C0430404
===>culture C0220814
===>Culture C1706355
===>Culture C2242979
===>Culture C0430400
===>Culture C0010453
===>CULTURE C1706355
===>culture C0010453
===>urine C0042037
===>URINE C0042036
===>urine C0042036
===>URINE C0042037
===>Urine C0042036
===>Site C0205145
===>Site C1515974
===>Site C2825164
===>Body C1551342
===>BODY C1268086
===>Body C1268086
===>Clean C1947930
===>culture C0220814
===>culture C0010453
===>Culture C0010453
===>Culture C2242979
===>CULTURE C1706355
===>Culture C0430400
===>Culture C1706355
===>Catch C0231617
===>Result C1274040
===>Results C1274040
===>Result C2825142
===>result C1274040
===>COL C0009367
===>COL C1704808
===>Escherichia coli C0014834
===>E C0439108
===>e C0439131
===>E C1553024
===>e C1551074
===>Final C0205088
===>ESCHERICHIA C0014833
===>Escherichia C0014833
===>Id C0020786
===>ID C0020787
===>ID C2986768
===>ID C2349049
===>ID C1522475
===>ID C0021247
===>ID C0600091
===>ML C0439242
===>mL C0439242
===>ML C1708949
===>ml C0439242
===>ML C0024581
===>ML C1706380
===>Ampicillin C0002680
===>AMPICILLIN C0002680
===>Ampicillin C0002680
===>AMPICILLIN C0002680
===>s C0457385
===>S C1551054
===>S C0439118
===>S C2825524
===>s C1704767
===>amikacin C0002499
===>Amikacin C0002499
===>AMIKACIN C0002499
===>AMIKACIN C0002499
===>amikacin C0002499
===>Amikacin C0002499

This is what I need. I could create a map, keep them there, and use it, but that requires
some changes to the code. Is there a way to avoid that?
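
The map idea could look roughly like the following. This is a minimal sketch, assuming the
"===>term CUI" lines printed above are available as plain strings; CuiIndex and parse are
hypothetical names, not part of cTAKES, and the sketch does not by itself remove the need
for the print statement in UmlsToSnomedConsumerImpl.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not a cTAKES class): indexes "===>term CUI" lines by term. */
class CuiIndex {

    /** Parses lines such as "===>Urine culture C0430404" into lower-cased term -> CUIs. */
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> index = new LinkedHashMap<>();
        for (String raw : lines) {
            // Drop the "==>" or "===>" prefix added by the print statement.
            String line = raw.replaceFirst("^=+>", "").trim();
            int split = line.lastIndexOf(' ');
            if (split < 0) {
                continue; // no "term CUI" pair on this line
            }
            String term = line.substring(0, split).trim().toLowerCase(Locale.ROOT);
            String cui = line.substring(split + 1);
            if (!cui.matches("C\\d{7}")) {
                continue; // last token does not look like a UMLS CUI
            }
            // Terms differing only in case ("urine", "URINE") share one entry.
            index.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(cui);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "===>Urine culture C0430404",
                "===>urine C0042037",
                "===>URINE C0042036",
                "===>amikacin C0002499",
                "===>AMIKACIN C0002499");
        parse(sample).forEach((term, cuis) -> System.out.println(term + " -> " + cuis));
    }
}
```

Terms are lower-cased so that case variants collapse into one key, and a set is used per
term because one term can map to several CUIs, as the output above shows.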


-----Original Message-----
From: Masanz, James J. [mailto:Masanz.James@mayo.edu] 
Sent: Tuesday, July 08, 2014 2:54 PM
To: 'dev@ctakes.apache.org'
Subject: RE: Retrieving CUIs

What gets annotated depends on which pipeline you use.
The pipeline mentioned in the User Guide should also annotate diseases/disorders, signs/symptoms,
procedures, and anatomical sites. So the first step would be to check which pipeline you are using.

Or perhaps you do not have the separately downloadable dictionary. To test that, try text
that includes the words "knee" and "pain". If those are annotated and other anatomical sites
and signs/symptoms are not, you probably don't have the dictionary resources or don't have
them on your classpath.

Good things to post to help us help you are:
 - whether you are using CVD or CPE GUI, or a program/script
 - the name of the pipeline you are running (are you loading an XML descriptor into CVD or CPE?),
or the name of the script/program
 - a sample output file (xcas/xmi) [make sure not to include any PHI in the sample text that
you process if you are going to post the output]

-- James

-----Original Message-----
From: Nick Nikandish [mailto:snikandi@emerginghealthit.com] 
Sent: Tuesday, July 08, 2014 1:43 PM
To: dev@ctakes.apache.org
Subject: Retrieving CUIs

Hi there,

I was wondering if there is any way to retrieve CUIs for the tokens of free text in cTAKES
without changing the codebase? I am able to retrieve CUIs only for the medications in a text.

