ctakes-dev mailing list archives

From "Geise, Brandon D." <bdge...@geisinger.edu>
Subject RE: Fast Dictionary Update
Date Thu, 17 Sep 2015 00:13:45 GMT
Sean,

I added that and still had the same issue.

Thanks,
Brandon
_____________________________
From: Finan, Sean <sean.finan@childrens.harvard.edu>
Sent: Wednesday, September 16, 2015 7:56 PM
Subject: RE: Fast Dictionary Update
To: <dev@ctakes.apache.org>


And you added "SNOMEDCT_US" to data/tiny/CtakesSources.txt ?

-----Original Message-----
From: Tomasz Oliwa [mailto:oliwa@uchicago.edu]
Sent: Wednesday, September 16, 2015 7:13 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

I have exactly the same problem with the tool.

A grep on MRCONSO.RRF for "SNOMEDCT" or for "SNOMEDCT_US" shows many lines.
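
A plain grep matches the string anywhere on a line, so it does not show which spelling actually sits in the source column. In the standard MRCONSO.RRF layout the source abbreviation (SAB) is the 12th pipe-delimited field. A minimal sketch that tallies rows per SAB value follows; the helper name `sab_counts` and the example path are illustrative and not part of the dictionary tool.

```shell
#!/bin/sh
# Tally rows per source abbreviation (SAB) in an MRCONSO.RRF file.
# Assumes the standard RRF layout, where SAB is the 12th pipe-delimited field.
# sab_counts is a hypothetical helper name, not part of the dictionary tool.
sab_counts() {
    cut -d'|' -f12 "$1" | sort | uniq -c | sort -rn
}

# Example (placeholder path): show only the SNOMED-like sources.
# sab_counts /path/to/UMLS/META/MRCONSO.RRF | grep SNOMED
```

Each output line is a row count followed by the exact SAB string, which is the value a source list would need to match.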

________________________________________
From: Geise, Brandon D. [bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 5:05 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Yes, it finds "SNOMEDCT_US".

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 5:17 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Ah, now I see what you mean. Can you do a grep on your MRCONSO.RRF for "SNOMEDCT" ?

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 4:04 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

I tried changing as suggested.

Below is what I see for the snomed piece, but for RXNorm it writes terms at the end.

Reading list of Source Types from ./data/default/CtakesSources.txt
File Lines 1 list of Source Types 1
Reading list of Tuis from ./data/tiny/CtakesSnomedTuis.txt
File Lines 24 list of Tuis 24
Compiling list of Cuis with wanted Tuis using /patto/UMLS_Current_Version/META/MRSTY.RRF
File Line 200000 Cuis 60895
File Line 300000 Cuis 85750
File Line 400000 Cuis 135098
File Line 600000 Cuis 183925
File Line 1700000 Cuis 376338
File Line 1800000 Cuis 471009
File Line 1900000 Cuis 568375
File Line 2100000 Cuis 674715
File Line 2800000 Cuis 903583
File Line 3300000 Cuis 973791
File Lines 3370173 Cuis 999451
..................................................File Line 100000 Valid Cuis 0
..................................................File Line 200000 Valid Cuis 0
..................................................File Line 300000 Valid Cuis 0
..................................................File Line 400000 Valid Cuis 0
..................................................File Line 500000 Valid Cuis 0
..................................................File Line 600000 Valid Cuis 0
..................................................File Line 700000 Valid Cuis 0
..................................................File Line 800000 Valid Cuis 0
..................................................File Line 900000 Valid Cuis 0
..................................................File Line 1000000<tel:1000000> Valid
Cuis 0
..................................................File Line 1100000<tel:1100000> Valid
Cuis 0
..................................................File Line 1200000<tel:1200000> Valid
Cuis 0
..................................................File Line 1300000<tel:1300000> Valid
Cuis 0
..................................................File Line 1400000<tel:1400000> Valid
Cuis 0
..................................................File Line 1500000<tel:1500000> Valid
Cuis 0
..................................................File Line 1600000<tel:1600000> Valid
Cuis 0
..................................................File Line 1700000<tel:1700000> Valid
Cuis 0
..................................................File Line 1800000<tel:1800000> Valid
Cuis 0
..................................................File Line 1900000<tel:1900000> Valid
Cuis 0
..................................................File Line 2000000<tel:2000000> Valid
Cuis 0
..................................................File Line 2100000<tel:2100000> Valid
Cuis 0
..................................................File Line 2200000<tel:2200000> Valid
Cuis 0
..................................................File Line 2300000<tel:2300000> Valid
Cuis 0
..................................................File Line 2400000<tel:2400000> Valid
Cuis 0
..................................................File Line 2500000<tel:2500000> Valid
Cuis 0
..................................................File Line 2600000<tel:2600000> Valid
Cuis 0
..................................................File Line 2700000<tel:2700000> Valid
Cuis 0
..................................................File Line 2800000<tel:2800000> Valid
Cuis 0
..................................................File Line 2900000<tel:2900000> Valid
Cuis 0
..................................................File Line 3000000<tel:3000000> Valid
Cuis 0
..................................................File Line 3100000<tel:3100000> Valid
Cuis 0
..................................................File Line 3200000<tel:3200000> Valid
Cuis 0
..................................................File Line 3300000<tel:3300000> Valid
Cuis 0
..................................................File Line 3400000<tel:3400000> Valid
Cuis 0
..................................................File Line 3500000<tel:3500000> Valid
Cuis 0
..................................................File Line 3600000<tel:3600000> Valid
Cuis 0
..................................................File Line 3700000<tel:3700000> Valid
Cuis 0
..................................................File Line 3800000<tel:3800000> Valid
Cuis 0
..................................................File Line 3900000<tel:3900000> Valid
Cuis 0
..................................................File Line 4000000<tel:4000000> Valid
Cuis 0
..................................................File Line 4100000<tel:4100000> Valid
Cuis 0
..................................................File Line 4200000<tel:4200000> Valid
Cuis 0
..................................................File Line 4300000<tel:4300000> Valid
Cuis 0
..................................................File Line 4400000<tel:4400000> Valid
Cuis 0
..................................................File Line 4500000<tel:4500000> Valid
Cuis 0
..................................................File Line 4600000<tel:4600000> Valid
Cuis 0
..................................................File Line 4700000<tel:4700000> Valid
Cuis 0
..................................................File Line 4800000<tel:4800000> Valid
Cuis 0
..................................................File Line 4900000<tel:4900000> Valid
Cuis 0
..................................................File Line 5000000<tel:5000000> Valid
Cuis 0
..................................................File Line 5100000<tel:5100000> Valid
Cuis 0
..................................................File Line 5200000<tel:5200000> Valid
Cuis 0
..................................................File Line 5300000<tel:5300000> Valid
Cuis 0
..................................................File Line 5400000<tel:5400000> Valid
Cuis 0
..................................................File Line 5500000<tel:5500000> Valid
Cuis 0
..................................................File Line 5600000<tel:5600000> Valid
Cuis 0
..................................................File Line 5700000<tel:5700000> Valid
Cuis 0
..................................................File Line 5800000<tel:5800000> Valid
Cuis 0
..................................................File Line 5900000<tel:5900000> Valid
Cuis 0
..................................................File Line 6000000<tel:6000000> Valid
Cuis 0
..................................................File Line 6100000<tel:6100000> Valid
Cuis 0
..................................................File Line 6200000<tel:6200000> Valid
Cuis 0
..................................................File Line 6300000<tel:6300000> Valid
Cuis 0
..................................................File Line 6400000<tel:6400000> Valid
Cuis 0
..................................................File Line 6500000<tel:6500000> Valid
Cuis 0
..................................................File Line 6600000<tel:6600000> Valid
Cuis 0
..................................................File Line 6700000<tel:6700000> Valid
Cuis 0
..................................................File Line 6800000<tel:6800000> Valid
Cuis 0
..................................................File Line 6900000<tel:6900000> Valid
Cuis 0
..................................................File Line 7000000<tel:7000000> Valid
Cuis 0
..................................................File Line 7100000<tel:7100000> Valid
Cuis 0
..................................................File Line 7200000<tel:7200000> Valid
Cuis 0
..................................................File Line 7300000<tel:7300000> Valid
Cuis 0
..................................................File Line 7400000<tel:7400000> Valid
Cuis 0
..................................................File Line 7500000<tel:7500000> Valid
Cuis 0
..................................................File Line 7600000<tel:7600000> Valid
Cuis 0
..................................................File Line 7700000<tel:7700000> Valid
Cuis 0
..................................................File Line 7800000<tel:7800000> Valid
Cuis 0
..................................................File Line 7900000<tel:7900000> Valid
Cuis 0
..................................................File Line 8000000<tel:8000000> Valid
Cuis 0
..................................................File Line 8100000<tel:8100000> Valid
Cuis 0
..................................................File Line 8200000<tel:8200000> Valid
Cuis 0
..................................................File Line 8300000<tel:8300000> Valid
Cuis 0
..................................................File Line 8400000<tel:8400000> Valid
Cuis 0
..................................................File Line 8500000<tel:8500000> Valid
Cuis 0
..................................................File Line 8600000<tel:8600000> Valid
Cuis 0
..................................................File Line 8700000<tel:8700000> Valid
Cuis 0
..................................................File Line 8800000<tel:8800000> Valid
Cuis 0
.............File Lines 8827152<tel:8827152> Valid Cuis 0
Compiling map of Umls Cuis and Texts
..................................................File Line 100000 Terms 0
..................................................File Line 200000 Terms 0
..................................................File Line 300000 Terms 0
..................................................File Line 400000 Terms 0
..................................................File Line 500000 Terms 0
..................................................File Line 600000 Terms 0
..................................................File Line 700000 Terms 0
..................................................File Line 800000 Terms 0
..................................................File Line 900000 Terms 0
..................................................File Line 1000000<tel:1000000> Terms
0
..................................................File Line 1100000<tel:1100000> Terms
0
..................................................File Line 1200000<tel:1200000> Terms
0
..................................................File Line 1300000<tel:1300000> Terms
0
..................................................File Line 1400000<tel:1400000> Terms
0
..................................................File Line 1500000<tel:1500000> Terms
0
..................................................File Line 1600000<tel:1600000> Terms
0
..................................................File Line 1700000<tel:1700000> Terms
0
..................................................File Line 1800000<tel:1800000> Terms
0
..................................................File Line 1900000<tel:1900000> Terms
0
..................................................File Line 2000000<tel:2000000> Terms
0
..................................................File Line 2100000<tel:2100000> Terms
0
..................................................File Line 2200000<tel:2200000> Terms
0
..................................................File Line 2300000<tel:2300000> Terms
0
..................................................File Line 2400000<tel:2400000> Terms
0
..................................................File Line 2500000<tel:2500000> Terms
0
..................................................File Line 2600000<tel:2600000> Terms
0
..................................................File Line 2700000<tel:2700000> Terms
0
..................................................File Line 2800000<tel:2800000> Terms
0
..................................................File Line 2900000<tel:2900000> Terms
0
..................................................File Line 3000000<tel:3000000> Terms
0
..................................................File Line 3100000<tel:3100000> Terms
0
..................................................File Line 3200000<tel:3200000> Terms
0
..................................................File Line 3300000<tel:3300000> Terms
0
..................................................File Line 3400000<tel:3400000> Terms
0
..................................................File Line 3500000<tel:3500000> Terms
0
..................................................File Line 3600000<tel:3600000> Terms
0
..................................................File Line 3700000<tel:3700000> Terms
0
..................................................File Line 3800000<tel:3800000> Terms
0
..................................................File Line 3900000<tel:3900000> Terms
0
..................................................File Line 4000000<tel:4000000> Terms
0
..................................................File Line 4100000<tel:4100000> Terms
0
..................................................File Line 4200000<tel:4200000> Terms
0
..................................................File Line 4300000<tel:4300000> Terms
0
..................................................File Line 4400000<tel:4400000> Terms
0
..................................................File Line 4500000<tel:4500000> Terms
0
..................................................File Line 4600000<tel:4600000> Terms
0
..................................................File Line 4700000<tel:4700000> Terms
0
..................................................File Line 4800000<tel:4800000> Terms
0
..................................................File Line 4900000<tel:4900000> Terms
0
..................................................File Line 5000000<tel:5000000> Terms
0
..................................................File Line 5100000<tel:5100000> Terms
0
..................................................File Line 5200000<tel:5200000> Terms
0
..................................................File Line 5300000<tel:5300000> Terms
0
..................................................File Line 5400000<tel:5400000> Terms
0
..................................................File Line 5500000<tel:5500000> Terms
0
..................................................File Line 5600000<tel:5600000> Terms
0
..................................................File Line 5700000<tel:5700000> Terms
0
..................................................File Line 5800000<tel:5800000> Terms
0
..................................................File Line 5900000<tel:5900000> Terms
0
..................................................File Line 6000000<tel:6000000> Terms
0
..................................................File Line 6100000<tel:6100000> Terms
0
..................................................File Line 6200000<tel:6200000> Terms
0
..................................................File Line 6300000<tel:6300000> Terms
0
..................................................File Line 6400000<tel:6400000> Terms
0
..................................................File Line 6500000<tel:6500000> Terms
0
..................................................File Line 6600000<tel:6600000> Terms
0
..................................................File Line 6700000<tel:6700000> Terms
0
..................................................File Line 6800000<tel:6800000> Terms
0
..................................................File Line 6900000<tel:6900000> Terms
0
..................................................File Line 7000000<tel:7000000> Terms
0
..................................................File Line 7100000<tel:7100000> Terms
0
..................................................File Line 7200000<tel:7200000> Terms
0
..................................................File Line 7300000<tel:7300000> Terms
0
..................................................File Line 7400000<tel:7400000> Terms
0
..................................................File Line 7500000<tel:7500000> Terms
0
..................................................File Line 7600000<tel:7600000> Terms
0
..................................................File Line 7700000<tel:7700000> Terms
0
..................................................File Line 7800000<tel:7800000> Terms
0
..................................................File Line 7900000<tel:7900000> Terms
0
..................................................File Line 8000000<tel:8000000> Terms
0
..................................................File Line 8100000<tel:8100000> Terms
0
..................................................File Line 8200000<tel:8200000> Terms
0
..................................................File Line 8300000<tel:8300000> Terms
0
..................................................File Line 8400000<tel:8400000> Terms
0
..................................................File Line 8500000<tel:8500000> Terms
0
..................................................File Line 8600000<tel:8600000> Terms
0
..................................................File Line 8700000<tel:8700000> Terms
0
..................................................File Line 8800000<tel:8800000> Terms
0
.............File Line 8827152<tel:8827152> Terms 0
Writing map of Cuis and Texts to pathtoUmls2015.bsv
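
Note that the first line of the log above reports the source list being read from ./data/default/CtakesSources.txt, while the Tui list is read from ./data/tiny. Whether that difference explains the zero counts is not established in this thread, but it is cheap to see what each copy of the file contains. A minimal sketch, assuming the directory layout implied by the paths in the log; `show_source_lists` is a hypothetical helper name.

```shell
#!/bin/sh
# Print every CtakesSources.txt found under a data directory, with its contents.
# show_source_lists is a hypothetical helper, not part of the dictionary tool.
show_source_lists() {
    # $1: root of the tool's data directory, e.g. ./data
    find "$1" -name 'CtakesSources.txt' | sort | while IFS= read -r f; do
        printf '== %s\n' "$f"
        # awk terminates every line, even if the file lacks a final newline
        awk '{ print }' "$f"
    done
}

# Example: show_source_lists ./data
```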

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 4:00 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Thank you! I believe that was a change made after 2011. You should actually be OK listing both
SNOMEDCT and SNOMEDCT_US in CtakesSources.txt.

Cheers,
Sean
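
Because the SNOMED source abbreviation differs between UMLS releases, it can help to confirm that every entry in the source list really occurs in the SAB column of the MRCONSO.RRF being used. A minimal sketch follows. It assumes CtakesSources.txt holds one source abbreviation per line (an assumption about the file format, not confirmed in this thread) and that SAB is the 12th pipe-delimited field of MRCONSO.RRF; `check_sources` is a hypothetical helper name.

```shell
#!/bin/sh
# Report which entries of a source list occur as SAB values in MRCONSO.RRF.
# Assumption: the source list has one abbreviation per line.
# check_sources is a hypothetical helper, not part of the dictionary tool.
check_sources() {
    # $1: source list file, $2: MRCONSO.RRF
    sabs=$(cut -d'|' -f12 "$2" | sort -u)
    while IFS= read -r src || [ -n "$src" ]; do
        src=$(printf '%s' "$src" | tr -d '\r')   # tolerate Windows line endings
        [ -z "$src" ] && continue
        # -x: whole-line match, -F: fixed string, so SNOMEDCT does not match SNOMEDCT_US
        if printf '%s\n' "$sabs" | grep -qxF -- "$src"; then
            echo "found   $src"
        else
            echo "missing $src"
        fi
    done < "$1"
}

# Example (placeholder paths):
# check_sources ./data/tiny/CtakesSources.txt /path/to/UMLS/META/MRCONSO.RRF
```

An entry reported as missing is not necessarily the cause of an empty dictionary, but it would never match any row of that MRCONSO.RRF.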

-----Original Message-----
From: Maite Meseure Hugues [mailto:meseure.maite@gmail.com]
Sent: Wednesday, September 16, 2015 3:43 PM
To: dev@ctakes.apache.org
Subject: Re: Fast Dictionary Update

If this helps, I had to replace 'SNOMEDCT' with 'SNOMEDCT_US' in CtakesSources.txt.

On Wed, Sep 16, 2015 at 2:33 PM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:

> I'm not sure that I understand your question. As I sent it, the anat,
> snomed and rxnorm are not separate runs. The args line I sent earlier
> is for a single run that will create a dictionary with snomed and
> rxnorm terms. The anatomy tui list has a special use in correctly
> processing snomed codes.
>
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
> Sent: Wednesday, September 16, 2015 3:27 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Ok, hopefully one last question.
>
> Based on your example everything runs, however the Anat and Snomed
> runs don't produce any valid CUIs but RXNorm does. I'm not sure if
> this has anything to do with it but every UMLS source read is against MRSTY.
>
> Here's my command
>
> java -cp dictionarytool.jar;lib/*
> org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls
> /path/to/UMLS/META -fd ./data/tiny -atui
> ./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt
> -ol path\to\fileUmls2015.bsv
>
> Any suggestions?
>
> Thanks again,
> Brandon
>
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Wednesday, September 16, 2015 3:05 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Yes, that will make the rare word dictionary in a memory-based hsql
> database - the same as the default for the dictionary-lookup-fast module.
>
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
> Sent: Wednesday, September 16, 2015 2:42 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Thanks Sean, much appreciated. To clarify, would the example below
> create the dictionary used for the rare word approach?
>
> Thanks,
> Brandon
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Wednesday, September 16, 2015 2:16 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Hi Brandon,
>
> I just checked in a bin/dictionarytool.zip. It should have everything
> that you need (.jar, lib/, data/).
> java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 [args]
> should do the trick.
>
> To recreate a 2015 version of the current ctakes dictionary, the
> arguments are:
> -umls my/path/to/2015AA/META -fd ./data/tiny
> -atui ./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt
> -db jdbc:hsqldb:file:my/path/to/snorx2015 -tbl CUI_TERMS
>
> Create my/path/to/snorx2015 by copying
> resources/memdbtemplate/ctakesumls.properties to
> my/path/to/snorx2015.properties - there is a resources/README about this.
>
> Before populating a DB, I usually do a trial run first, writing to a
> flat file. Replace "-db ... -tbl ..." with "-ol my/path/to/testout.bsv"
>
>
> Sean
>
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
> Sent: Wednesday, September 16, 2015 1:49 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Hi Sean,
>
> That'd be great.
>
> I think I'm building it incorrectly because after I build the jar and
> try to run specifying DictionaryCreator2 as the main class it says it
> can't find it. I'm not too familiar with Java and building
> projects/jars so it could be my ignorance causing the problem.
>
> Thanks,
> Brandon
>
> -----Original Message-----
> From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> Sent: Wednesday, September 16, 2015 1:45 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Hi Brandon,
>
> I can send you a jar or commit one pre-built. What goes wrong when
> you try to build the tool?
>
> Sean
>
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
> Sent: Wednesday, September 16, 2015 1:23 PM
> To: 'dev@ctakes.apache.org'
> Subject: Fast Dictionary Update
>
> Does someone have the DictionaryTool jar available? I'm having
> trouble creating the jar file from the project and would like to be
> able to create an updated UMLS fast dictionary for 2015.
>
> Thanks,
> Brandon
>
>
>
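
The flat-file trial run described in the quoted messages above can be assembled as sketched here. The flags (-umls, -fd, -atui, -tui, -ol) and the class name are taken from the thread; `trial_run_cmd` is a hypothetical helper that only prints the command for inspection, and the paths passed to it are placeholders. One detail worth hedging on: the commands in the thread use ';' as the classpath separator, which is the Windows form, while Java on Linux or macOS expects ':'. In a POSIX shell the classpath should also be quoted so that the shell neither treats ';' as a command terminator nor expands lib/* itself.

```shell
#!/bin/sh
# Print the trial-run command for DictionaryCreator2, writing to a flat file.
# trial_run_cmd is a hypothetical helper; it prints the command, it does not run it.
trial_run_cmd() {
    umls_meta=$1      # UMLS META directory (holds MRCONSO.RRF and MRSTY.RRF)
    out_bsv=$2        # flat file to write instead of populating a database
    sep=${3:-:}       # classpath separator: ':' on Linux/macOS, ';' on Windows
    printf 'java -cp "dictionarytool.jar%slib/*" %s -umls "%s" -fd ./data/tiny -atui ./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt -ol "%s"\n' \
        "$sep" org.apache.ctakes.dictionarytool.DictionaryCreator2 "$umls_meta" "$out_bsv"
}

# Example (placeholder paths):
# trial_run_cmd my/path/to/2015AA/META my/path/to/testout.bsv
```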


