ctakes-dev mailing list archives

From "Geise, Brandon D." <bdge...@geisinger.edu>
Subject RE: Fast Dictionary Update
Date Fri, 18 Sep 2015 02:22:51 GMT
You can disregard my question about the relation extraction as I fixed this by building the
new dictionary with the default data files in the dictionarytool.  I am curious about the
SNOMED change still though.

Thanks,
Brandon

-----Original Message-----
From: Geise, Brandon D. 
Sent: Thursday, September 17, 2015 9:40 PM
To: cTAKES Developer list <dev@ctakes.apache.org>
Subject: RE: Fast Dictionary Update

Thanks Dmitriy.  I was referring to the RelationsExtractor class found in the dictionarytool.
On a similar note, the coding scheme for all SNOMEDCT codes in the new dictionary is CTAKES,
whereas it is SNOMED in the UMLS version packaged with cTakes.  Is there something else I
need to run for the dictionary creation that I'm missing?

Thanks,
Brandon

-----Original Message-----
From: Dligach, Dmitriy [mailto:Dmitriy.Dligach@childrens.harvard.edu] 
Sent: Thursday, September 17, 2015 8:42 PM
To: cTAKES Developer list <dev@ctakes.apache.org>
Subject: Re: Fast Dictionary Update

Hi Brandon,

Relation extraction at the moment only handles two specific relation types: LocationOf and
DegreeOf. You are welcome to run it if you need these specific relations.


Dima

--
Dmitriy (Dima) Dligach, Ph.D.
Boston Children's Hospital and Harvard Medical School
(617) 651-0397



On Sep 17, 2015, at 17:08, Geise, Brandon D. <bdgeise@geisinger.edu> wrote:

Does the RelationsExtractor need to be run in order to generate information on relationships
from cTakes?  When running with 2011 UMLS dictionary I'm able to get relationships for BodyLocationMentions
but with the dictionary I created I am not getting this information.  Any advice?

Thanks,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Thursday, September 17, 2015 1:18 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

It claims that the database is connected, and the preceding line of dots is printed during
loading, which took ~3-4 seconds (so something was there):
............
17 Sep 2015 12:58:58  INFO JdbcConnectionFactory -  Database connected

Strange.  I don't really know what to tell you right now.  Perhaps something will click with
me later ...


Did you also run org.apache.ctakes.dictionarytool.CodeMapCreator ?  It isn't strictly necessary
but it stores the tuis in the database so that cTakes can identify the semantic group of a
mention.




-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Thursday, September 17, 2015 1:02 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Not specifically loaded.  Here's what I see when loading the pipeline:

17 Sep 2015 12:58:54  INFO JdbcConnectionFactory - Connecting to jdbc:hsqldb:file:path/to/ctakes/ctakes-dictionary-lookup-fast-res/src/main/resources/org/apache/ctakes/dictionary/lookup/fast/UMLS2015/snorx2015:
............
17 Sep 2015 12:58:58  INFO JdbcConnectionFactory -  Database connected

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Thursday, September 17, 2015 12:57 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Making an alternate copy of cTakesHsql.xml and pointing to the new dictionary is all that
is necessary.  Do you see a message in the initialization output indicating that the dictionary
db has been loaded?

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Thursday, September 17, 2015 12:54 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Great, thanks.  Both seemed to work for populating the script table.

Besides the path to the new dictionary needing to be changed in cTakesHsql.xml, does anything
else need to be modified to use the new dictionary?  My pipeline runs; however, there aren't
any annotations related to the UMLS concepts.  The only annotations I'm seeing are date, roman
numeral, or modifier related. (My pipeline is UMLSFastProcessor with additions for modifiers
and templatefiller.)  Any suggestions would be appreciated.

Thanks,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Thursday, September 17, 2015 10:40 AM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Correct, Hsql should automatically read the .log file upon first use, and then perform the
inserts into the .script file.

In case you want to play it safe, check the README in the resource/ directory (where you got
the hsqldb template).  The last paragraph indicates how you can launch a simple sql tool to
play with the db.  You will need to change the name of the db accordingly.  Upon first launch
of the sql tool everything should be moved from the .log to the .script file.   It is a strange
setup/workflow, but it seems to work.

Sean
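
One way to confirm that the inserts really moved from the .log into the .script file is to count the INSERT statements there. The following is a minimal Python sketch, not part of the dictionarytool. It assumes the table is named CUI_TERMS (the -tbl value used elsewhere in this thread) and that HSQLDB writes each row of the table as one INSERT statement per line in the .script file. The function name and the example file name are hypothetical.

```python
import re
from typing import Iterable


def count_inserts(lines: Iterable[str], table: str = "CUI_TERMS") -> int:
    """Count lines that insert a row into `table`.

    An optional schema prefix (for example PUBLIC.) is tolerated, since
    whether one is written depends on the HSQLDB version (assumption).
    """
    pattern = re.compile(
        rf"^INSERT INTO (?:\w+\.)?{re.escape(table)}\b", re.IGNORECASE
    )
    return sum(1 for line in lines if pattern.match(line))
```

For example, `count_inserts(open("snorx2015.script", encoding="utf-8"))` returning 0 after the first launch would suggest that the rows never reached the .script file.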

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Thursday, September 17, 2015 10:31 AM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

When I run the tool it outputs a file with a .log extension that has all the insert statements.
 Do I copy this to the .script template from memcachedb in the dictionarytool project or should
the inserts be put into the .script file by default on the program execution?

Thanks,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 9:59 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Excellent!

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 9:55 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

No, I had changed it on the Tiny source file.  I just changed the default file and it looks
to be running as expected now.

Thank you for all your help and patience,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 9:35 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Did you add it to data/default/CtakesSources.txt ?

If not then you need to specify -src ./data/tiny/CtakesSources.txt

Sorry for any confusion.

As soon as my inet isn't overloaded I'll download 2015AA and see if I can build a dictionary.

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 8:14 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Sean,

I added that and still had the same issue.

Thanks,
Brandon
_____________________________
From: Finan, Sean <sean.finan@childrens.harvard.edu>
Sent: Wednesday, September 16, 2015 7:56 PM
Subject: RE: Fast Dictionary Update
To: <dev@ctakes.apache.org>


And you added "SNOMEDCT_US" to data/tiny/CtakesSources.txt ?

-----Original Message-----
From: Tomasz Oliwa [mailto:oliwa@uchicago.edu]
Sent: Wednesday, September 16, 2015 7:13 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

I have exactly the same problem with the tool.

A grep on MRCONSO.RRF for "SNOMEDCT" or for "SNOMEDCT_US" shows many lines.
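
A plain grep matches the string anywhere in a line. A narrower check, sketched below in Python, counts only the source abbreviation (SAB) column, which is the 12th pipe-delimited field of MRCONSO.RRF in the UMLS Rich Release Format. This is an illustrative helper and not part of the dictionarytool; the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable

SAB_COLUMN = 11  # zero-based index of the SAB field in MRCONSO.RRF


def count_sources(lines: Iterable[str], prefix: str = "SNOMEDCT") -> Counter:
    """Count MRCONSO rows per source abbreviation that starts with `prefix`."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        # Skip malformed rows that are too short to contain a SAB field.
        if len(fields) > SAB_COLUMN and fields[SAB_COLUMN].startswith(prefix):
            counts[fields[SAB_COLUMN]] += 1
    return counts
```

Calling `count_sources(open("/path/to/UMLS/META/MRCONSO.RRF", encoding="utf-8"))` (placeholder path) shows the exact source names present, which can then be compared with the entries in CtakesSources.txt.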

________________________________________
From: Geise, Brandon D. [bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 5:05 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Yes, it finds "SNOMEDCT_US".

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 5:17 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Ah, now I see what you mean. Can you do a grep on your MRCONSO.RRF for "SNOMEDCT" ?

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 4:04 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

I tried changing as suggested.

Below is what I see for the snomed piece, but for RXNorm it writes terms at the end.

Reading list of Source Types from ./data/default/CtakesSources.txt
File Lines 1 list of Source Types 1
Reading list of Tuis from ./data/tiny/CtakesSnomedTuis.txt
File Lines 24 list of Tuis 24
Compiling list of Cuis with wanted Tuis using /patto/UMLS_Current_Version/META/MRSTY.RRF
File Line 200000 Cuis 60895
File Line 300000 Cuis 85750
File Line 400000 Cuis 135098
File Line 600000 Cuis 183925
File Line 1700000 Cuis 376338
File Line 1800000 Cuis 471009
File Line 1900000 Cuis 568375
File Line 2100000 Cuis 674715
File Line 2800000 Cuis 903583
File Line 3300000 Cuis 973791
File Lines 3370173 Cuis 999451
..................................................File Line 100000 Valid Cuis 0
..................................................File Line 200000 Valid Cuis 0
..................................................File Line 300000 Valid Cuis 0
..................................................File Line 400000 Valid Cuis 0
..................................................File Line 500000 Valid Cuis 0
..................................................File Line 600000 Valid Cuis 0
..................................................File Line 700000 Valid Cuis 0
..................................................File Line 800000 Valid Cuis 0
..................................................File Line 900000 Valid Cuis 0
..................................................File Line 1000000 Valid Cuis 0
..................................................File Line 1100000 Valid Cuis 0
..................................................File Line 1200000 Valid Cuis 0
..................................................File Line 1300000 Valid Cuis 0
..................................................File Line 1400000 Valid Cuis 0
..................................................File Line 1500000 Valid Cuis 0
..................................................File Line 1600000 Valid Cuis 0
..................................................File Line 1700000 Valid Cuis 0
..................................................File Line 1800000 Valid Cuis 0
..................................................File Line 1900000 Valid Cuis 0
..................................................File Line 2000000 Valid Cuis 0
..................................................File Line 2100000 Valid Cuis 0
..................................................File Line 2200000 Valid Cuis 0
..................................................File Line 2300000 Valid Cuis 0
..................................................File Line 2400000 Valid Cuis 0
..................................................File Line 2500000 Valid Cuis 0
..................................................File Line 2600000 Valid Cuis 0
..................................................File Line 2700000 Valid Cuis 0
..................................................File Line 2800000 Valid Cuis 0
..................................................File Line 2900000 Valid Cuis 0
..................................................File Line 3000000 Valid Cuis 0
..................................................File Line 3100000 Valid Cuis 0
..................................................File Line 3200000 Valid Cuis 0
..................................................File Line 3300000 Valid Cuis 0
..................................................File Line 3400000 Valid Cuis 0
..................................................File Line 3500000 Valid Cuis 0
..................................................File Line 3600000 Valid Cuis 0
..................................................File Line 3700000 Valid Cuis 0
..................................................File Line 3800000 Valid Cuis 0
..................................................File Line 3900000 Valid Cuis 0
..................................................File Line 4000000 Valid Cuis 0
..................................................File Line 4100000 Valid Cuis 0
..................................................File Line 4200000 Valid Cuis 0
..................................................File Line 4300000 Valid Cuis 0
..................................................File Line 4400000 Valid Cuis 0
..................................................File Line 4500000 Valid Cuis 0
..................................................File Line 4600000 Valid Cuis 0
..................................................File Line 4700000 Valid Cuis 0
..................................................File Line 4800000 Valid Cuis 0
..................................................File Line 4900000 Valid Cuis 0
..................................................File Line 5000000 Valid Cuis 0
..................................................File Line 5100000 Valid Cuis 0
..................................................File Line 5200000 Valid Cuis 0
..................................................File Line 5300000 Valid Cuis 0
..................................................File Line 5400000 Valid Cuis 0
..................................................File Line 5500000 Valid Cuis 0
..................................................File Line 5600000 Valid Cuis 0
..................................................File Line 5700000 Valid Cuis 0
..................................................File Line 5800000 Valid Cuis 0
..................................................File Line 5900000 Valid Cuis 0
..................................................File Line 6000000 Valid Cuis 0
..................................................File Line 6100000 Valid Cuis 0
..................................................File Line 6200000 Valid Cuis 0
..................................................File Line 6300000 Valid Cuis 0
..................................................File Line 6400000 Valid Cuis 0
..................................................File Line 6500000 Valid Cuis 0
..................................................File Line 6600000 Valid Cuis 0
..................................................File Line 6700000 Valid Cuis 0
..................................................File Line 6800000 Valid Cuis 0
..................................................File Line 6900000 Valid Cuis 0
..................................................File Line 7000000 Valid Cuis 0
..................................................File Line 7100000 Valid Cuis 0
..................................................File Line 7200000 Valid Cuis 0
..................................................File Line 7300000 Valid Cuis 0
..................................................File Line 7400000 Valid Cuis 0
..................................................File Line 7500000 Valid Cuis 0
..................................................File Line 7600000 Valid Cuis 0
..................................................File Line 7700000 Valid Cuis 0
..................................................File Line 7800000 Valid Cuis 0
..................................................File Line 7900000 Valid Cuis 0
..................................................File Line 8000000 Valid Cuis 0
..................................................File Line 8100000 Valid Cuis 0
..................................................File Line 8200000 Valid Cuis 0
..................................................File Line 8300000 Valid Cuis 0
..................................................File Line 8400000 Valid Cuis 0
..................................................File Line 8500000 Valid Cuis 0
..................................................File Line 8600000 Valid Cuis 0
..................................................File Line 8700000 Valid Cuis 0
..................................................File Line 8800000 Valid Cuis 0
.............File Lines 8827152 Valid Cuis 0
Compiling map of Umls Cuis and Texts
..................................................File Line 100000 Terms 0
..................................................File Line 200000 Terms 0
..................................................File Line 300000 Terms 0
..................................................File Line 400000 Terms 0
..................................................File Line 500000 Terms 0
..................................................File Line 600000 Terms 0
..................................................File Line 700000 Terms 0
..................................................File Line 800000 Terms 0
..................................................File Line 900000 Terms 0
..................................................File Line 1000000 Terms 0
..................................................File Line 1100000 Terms 0
..................................................File Line 1200000 Terms 0
..................................................File Line 1300000 Terms 0
..................................................File Line 1400000 Terms 0
..................................................File Line 1500000 Terms 0
..................................................File Line 1600000 Terms 0
..................................................File Line 1700000 Terms 0
..................................................File Line 1800000 Terms 0
..................................................File Line 1900000 Terms 0
..................................................File Line 2000000 Terms 0
..................................................File Line 2100000 Terms 0
..................................................File Line 2200000 Terms 0
..................................................File Line 2300000 Terms 0
..................................................File Line 2400000 Terms 0
..................................................File Line 2500000 Terms 0
..................................................File Line 2600000 Terms 0
..................................................File Line 2700000 Terms 0
..................................................File Line 2800000 Terms 0
..................................................File Line 2900000 Terms 0
..................................................File Line 3000000 Terms 0
..................................................File Line 3100000 Terms 0
..................................................File Line 3200000 Terms 0
..................................................File Line 3300000 Terms 0
..................................................File Line 3400000 Terms 0
..................................................File Line 3500000 Terms 0
..................................................File Line 3600000 Terms 0
..................................................File Line 3700000 Terms 0
..................................................File Line 3800000 Terms 0
..................................................File Line 3900000 Terms 0
..................................................File Line 4000000 Terms 0
..................................................File Line 4100000 Terms 0
..................................................File Line 4200000 Terms 0
..................................................File Line 4300000 Terms 0
..................................................File Line 4400000 Terms 0
..................................................File Line 4500000 Terms 0
..................................................File Line 4600000 Terms 0
..................................................File Line 4700000 Terms 0
..................................................File Line 4800000 Terms 0
..................................................File Line 4900000 Terms 0
..................................................File Line 5000000 Terms 0
..................................................File Line 5100000 Terms 0
..................................................File Line 5200000 Terms 0
..................................................File Line 5300000 Terms 0
..................................................File Line 5400000 Terms 0
..................................................File Line 5500000 Terms 0
..................................................File Line 5600000 Terms 0
..................................................File Line 5700000 Terms 0
..................................................File Line 5800000 Terms 0
..................................................File Line 5900000 Terms 0
..................................................File Line 6000000 Terms 0
..................................................File Line 6100000 Terms 0
..................................................File Line 6200000 Terms 0
..................................................File Line 6300000 Terms 0
..................................................File Line 6400000 Terms 0
..................................................File Line 6500000 Terms 0
..................................................File Line 6600000 Terms 0
..................................................File Line 6700000 Terms 0
..................................................File Line 6800000 Terms 0
..................................................File Line 6900000 Terms 0
..................................................File Line 7000000 Terms 0
..................................................File Line 7100000 Terms 0
..................................................File Line 7200000 Terms 0
..................................................File Line 7300000 Terms 0
..................................................File Line 7400000 Terms 0
..................................................File Line 7500000 Terms 0
..................................................File Line 7600000 Terms 0
..................................................File Line 7700000 Terms 0
..................................................File Line 7800000 Terms 0
..................................................File Line 7900000 Terms 0
..................................................File Line 8000000 Terms 0
..................................................File Line 8100000 Terms 0
..................................................File Line 8200000 Terms 0
..................................................File Line 8300000 Terms 0
..................................................File Line 8400000 Terms 0
..................................................File Line 8500000 Terms 0
..................................................File Line 8600000 Terms 0
..................................................File Line 8700000 Terms 0
..................................................File Line 8800000 Terms 0
.............File Line 8827152 Terms 0
Writing map of Cuis and Texts to pathtoUmls2015.bsv
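
The output above is long, but only the final counter of each phase matters: many Cuis are collected from MRSTY.RRF, yet Valid Cuis and Terms both end at 0. A small Python sketch like the following could pull those final counters out of a captured log. It is not part of the dictionarytool; the function name is hypothetical, and the regular expression is an assumption based only on the line shapes shown in this message.

```python
import re
from typing import Dict

# Matches e.g. "File Line 200000 Cuis 60895" or "File Lines 8827152 Valid Cuis 0".
COUNTER = re.compile(r"File Lines? \d+ (Valid Cuis|Cuis|Terms) (\d+)")


def last_counts(log_text: str) -> Dict[str, int]:
    """Return the last reported value of each counter in the log."""
    # Drop mail-client residue such as <tel:1700000> and undo line wrapping.
    cleaned = re.sub(r"<tel:\d+>", "", log_text)
    cleaned = re.sub(r"\s+", " ", cleaned)
    counts: Dict[str, int] = {}
    for label, value in COUNTER.findall(cleaned):
        counts[label] = int(value)  # later matches overwrite earlier ones
    return counts
```

A result where "Cuis" is large but "Valid Cuis" is 0 would be consistent with a source filter that matches nothing, which is the situation discussed later in this thread.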

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 4:00 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Thank you! I believe that was a change post 2011! You should actually be ok with both SNOMEDCT
and SNOMEDCT_US in CtakesSources.txt

Cheers,
Sean

-----Original Message-----
From: Maite Meseure Hugues [mailto:meseure.maite@gmail.com]
Sent: Wednesday, September 16, 2015 3:43 PM
To: dev@ctakes.apache.org
Subject: Re: Fast Dictionary Update

If this helps, I had to replace 'SNOMEDCT' with 'SNOMEDCT_US' in CtakesSources.txt.

On Wed, Sep 16, 2015 at 2:33 PM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:

I'm not sure that I understand your question. As I sent it, the anat, snomed and rxnorm are
not separate runs. The args line I sent earlier is for a single run that will create a dictionary
with snomed and rxnorm terms. The anatomy tui list has a special use in correctly processing
snomed codes.

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 3:27 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Ok, hopefully one last question.

Based on your example everything runs, however the Anat and Snomed runs don't produce any
valid CUIs but RXNorm does. I'm not sure if this has anything to do with it but every UMLS
source read is against MRSTY.

Here's my command

java -cp dictionarytool.jar;lib/*
org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls /path/to/UMLS/META -fd ./data/tiny
-atui ./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt -ol path o ileUmls2015.bsv

Any suggestions?

Thanks again,
Brandon


-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 3:05 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Yes, that will make the rare word dictionary in a memory-based hsql database - the same as
the default for the dictionary-lookup-fast module.

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 2:42 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Thanks Sean, much appreciated. To clarify the example below would create the dictionary for
use for the rare word approach?

Thanks,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 2:16 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Hi Brandon,

I just checked in a bin/dictionarytool.zip.  It should have everything that you need (.jar,
lib/, data/).
java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 [args]
should do the trick.

To recreate a 2015 version of the current ctakes dictionary, the arguments
are:
-umls my/path/to/2015AA/META -fd ./data/tiny -atui ./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt -db jdbc:hsqldb:file:my/path/to/snorx2015 -tbl CUI_TERMS

Create my/path/to/snorx2015 by copying
resources/memdbtemplate/ctakesumls.properties to my/path/to/snorx2015.properties - there is
a resources/README about this.

Before populating a DB, I usually do a trial run first, writing to a flat file. Replace "-db
... -tbl ..." with "-ol my/path/to/testout.bsv"


Sean
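
The command in this message uses ";" as the classpath separator, which is the Windows form; on Linux and macOS the separator is ":". A minimal Python sketch that assembles the same argument list with the platform's separator, for either the flat-file trial run or the database run, might look like this. Only the flags quoted in this thread are used; the function name and parameter names are hypothetical.

```python
import os
from typing import List, Optional

MAIN_CLASS = "org.apache.ctakes.dictionarytool.DictionaryCreator2"


def build_command(umls_meta: str, data_dir: str = "./data/tiny",
                  db_url: Optional[str] = None, table: str = "CUI_TERMS",
                  flat_file: Optional[str] = None) -> List[str]:
    """Build the java command line for one dictionary-creation run."""
    if (db_url is None) == (flat_file is None):
        raise ValueError("give exactly one of db_url or flat_file")
    # os.pathsep is ";" on Windows and ":" elsewhere.
    classpath = os.pathsep.join(["dictionarytool.jar", "lib/*"])
    cmd = ["java", "-cp", classpath, MAIN_CLASS,
           "-umls", umls_meta,
           "-fd", data_dir,
           "-atui", f"{data_dir}/CtakesAnatTuis.txt",
           "-tui", f"{data_dir}/CtakesSnomedTuis.txt"]
    if flat_file is not None:
        cmd += ["-ol", flat_file]          # trial run to a flat file
    else:
        cmd += ["-db", db_url, "-tbl", table]  # populate the database
    return cmd
```

Passing the resulting list to `subprocess.run(cmd, check=True)` avoids shell expansion of `lib/*`, leaving the wildcard for the Java launcher to expand.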

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 1:49 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Hi Sean,

That'd be great.

I think I'm building it incorrectly because after I build the jar and try to run specifying
DictionaryCreator2 as the main class it says it can't find it. I'm not too familiar with Java
and building projects/jars so it could be my ignorance causing the problem.

Thanks,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 1:45 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Hi Brandon,

I can send you a jar or commit one pre-built. What goes wrong when you try to build the tool?

Sean

-----Original Message-----
From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
Sent: Wednesday, September 16, 2015 1:23 PM
To: 'dev@ctakes.apache.org'
Subject: Fast Dictionary Update

Does someone have the DictionaryTool jar available? I'm having trouble creating the jar file
from the project and would like to be able to create an updated UMLS fast dictionary for 2015.

Thanks,
Brandon















