Mailing-List: contact dev-help@ctakes.apache.org; run by ezmlm
Reply-To: dev@ctakes.apache.org
MIME-Version: 1.0
References: <7d4a6b946e954659a53a92b0aebb267c@LOFEXMBX207W12V.geisinger.edu>
Date: Thu, 10 Dec 2015 09:13:58 +0200
Subject: RE: ctakes with icd10; 2015 versions available on sourceforge!
From: Alaa al Barari
To: dev@ctakes.apache.org
Content-Type: multipart/alternative; boundary=001a114020ac5a7195052685f359

--001a114020ac5a7195052685f359
Content-Type: text/plain; charset=UTF-8

Thanks, but is what I ended up with wrong?

On Dec 10, 2015 4:26 AM, "Finan, Sean" wrote:

> Hi Alaa,
>
> If you downloaded the 2015 .properties and .script files then you do not
> need to run the dictionary creation tool. Those databases are already
> populated and ready to use.
>
> Sean
>
>
> -----Original Message-----
> From: Alaa al Barari [mailto:alaa.albarari@gmail.com]
> Sent: Wednesday, December 09, 2015 6:33 PM
> To: dev@ctakes.apache.org
> Subject: Re: ctakes with icd10; 2015 versions available on sourceforge!
>
> So basically it looks like the path had Desktop capitalized; that's why it
> did not work.
>
> I ended up having rows like this inside ctakesicd2015.script:
>
> INSERT INTO CUI_TERMS VALUES(2723481,8,15,'magnesium sulfate 1000 mg / 50 ml - nacl 0 . 9 % intravenous solution','nacl')
> INSERT INTO CUI_TERMS VALUES(2723481,9,16,'magnesium sulfate , 2 g / 100 ml - nacl 0 . 9 % intravenous solution','nacl')
> INSERT INTO CUI_TERMS VALUES(2723481,0,7,'magnesium sulfate 20 mg / ml injection','magnesium')
>
>
> Does this mean it worked?
>
>
> On Thu, Dec 10, 2015 at 1:07 AM, Alaa al Barari wrote:
>
> > Thanks Finan and Brandon, your help is appreciated a lot.
> >
> > I downloaded the dictionary tool from
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__svn.apache.org_repos_asf_ctakes_sandbox_dictionarytool_bin_dictionarytool.zip&d=BQIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=uJq_3OpLiUaBOz9vqxKBI-gUAtLhJMme9uKXqroHhMM&s=JVOlLM08gTn5rV2T3R_bqeZT8XbMDgLhfKg8Fo5mAQw&e=
> > I hope it's the latest and bug free.
> >
> >
> > My running command is:
> > java -cp ./dictionarytool.jar:lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls /home/abarari/Desktop/umls/2015AB/META/ -atui ./data/optional/CtakesAnatTuis.txt -db jdbc:hsqldb:file:/home/abarari/Desktop/dictionarytool/output/ctakesicd2015 -tbl CUI_TERMS -df ./data/optional/ -src ./data/small/ConversionSources.txt -tui ./data/optional/CtakesAllTuis.txt
> >
> >
> >
> > I am running on Ubuntu by the way ... anyway under
> > /home/abarari/Desktop/dictionarytool/output/
> >
> > there is only
> >
> > abarari@ubuntu:~/Desktop/dictionarytool/output$ ls
> > ctakesicd2015.log  ctakesicd2015.properties  ctakesicd2015.script
> >
> >
> > Where is the database? Am I doing something wrong? Do I need to
> > create the database before executing the dictionarytool or what?
> >
> >
> > I found a couple of issues in the dictionary tool; it does not work well
> > with relative paths.
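
One way to approach the "does this mean it worked?" question without starting cTAKES: an HSQLDB file database keeps its table definitions and, for in-memory tables, its row data as SQL statements in the .script file, which is why no separate database file shows up next to ctakesicd2015.properties and ctakesicd2015.script. The Java sketch below counts the INSERT statements per table in such a file. It is only an illustration and not part of the dictionary tool: the class name ScriptRowCounter is hypothetical, and it assumes every row sits on one line that starts with INSERT INTO <table> VALUES, as in the rows quoted above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of cTAKES or the dictionary tool.
class ScriptRowCounter {

    private static final String PREFIX = "INSERT INTO ";

    /** Counts INSERT statements per table name in the lines of a .script file. */
    static Map<String, Integer> countInserts(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            if (!line.startsWith(PREFIX)) {
                continue; // CREATE TABLE, SET ... and other statements are ignored
            }
            int end = line.indexOf(" VALUES", PREFIX.length());
            if (end < 0) {
                continue; // not in the assumed "INSERT INTO <table> VALUES(...)" shape
            }
            String table = line.substring(PREFIX.length(), end).trim();
            counts.merge(table, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Path is an assumption; pass the real .script path as the first argument.
        String path = args.length > 0 ? args[0] : "ctakesicd2015.script";
        // ISO-8859-1 decodes any byte sequence, and only the ASCII prefix is inspected.
        List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.ISO_8859_1);
        countInserts(lines).forEach((table, n) -> System.out.println(table + ": " + n));
    }
}
```

A CUI_TERMS count in the hundreds of thousands would be in the same range as the term counts reported for the prebuilt 2015 dictionaries later in this thread; a count of zero would suggest the tool did not populate the table.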
> >
> >
> > On Wed, Dec 9, 2015 at 7:11 AM, Pei Chen wrote:
> >
> >> Brandon,
> >> That sounds great!
> >> Please open a Jira ticket for any contributions (anyone should be
> >> able to create a Jira account). There are some legal items built
> >> into the ASF Jira attachments for accepting contributions/donations.
> >> It will also credit the contributors with the merit appropriately.
> >> Anyone who is interested can follow the Jira item. (Even better if
> >> contributions were open discussion/open development.)
> >> --Pei
> >>
> >> On Tue, Dec 8, 2015 at 10:36 PM, Geise, Brandon D. wrote:
> >> > I'd be interested in contributing to making the dictionary tool more
> >> > user friendly with a GUI.
> >> >
> >> > Thanks,
> >> > Brandon
> >> >
> >> > -----Original Message-----
> >> > From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> >> > Sent: Tuesday, December 08, 2015 6:12 PM
> >> > To: dev@ctakes.apache.org
> >> > Subject: RE: ctakes with icd10; 2015 versions available on sourceforge!
> >> >
> >> > Hi Dave,
> >> >
> >> > I'm always happy to see interest in our stuff!
> >> >
> >> >>Step 1
> >> > I built the tool to be able to build a dictionary using anything in the
> >> > umls - snomed, icd9, hpo, etc. so using the veterinary extension
> >> > shouldn't be a problem. You just add it to the CtakesSources file
> >> > (or create an alternate file and point to it with -src). To answer
> >> > another of your questions, there can be zero or more sources - you
> >> > saw snomedct and snomedct_us (each valid in a different umls version).
> >> > It also can include any semantic type, just add (or remove) the
> >> > appropriate tuis in a different data file.
> >> >
> >> >>Step 2
> >> > You have it right - you copy the templates to another location and
> >> > output to that location. Otherwise you 'lose' your templates.
> >> >
> >> >>Step 3 and 4
> >> > The jar is built from source. I need to (soon) check in updates to the
> >> > source, and at the same time I can check in a default prebuilt .jar.
> >> > The lib/ directory is in the source repository.
> >> >
> >> > Various people have toyed with the idea of putting the tool into a
> >> > ctakes module, putting it into an "installation package", making a gui ...
> >> > The best option (imo) is probably to make an easy to use gui and keep
> >> > a pre-built version in sandbox. Someday, after the rainbow, maybe
> >> > I'll get a chance to do that ...
> >> >
> >> > Sean
> >> >
> >> >
> >> > -----Original Message-----
> >> > From: David Kincaid [mailto:kincaid.dave@gmail.com]
> >> > Sent: Tuesday, December 08, 2015 4:57 PM
> >> > To: dev@ctakes.apache.org
> >> > Subject: Re: ctakes with icd10; 2015 versions available on sourceforge!
> >> >
> >> > Thanks, Sean! It's great that cTAKES may soon have an up to date
> >> > database out of the box. Hopefully it will cut down on the need for
> >> > many to build their own DB's. Thank you much for doing that.
> >> >
> >> > Unfortunately, I still will need to build a custom one for us. I work
> >> > in veterinary medicine so I need to add in the veterinary extension
> >> > for SNOMED-CT into the database.
> >> >
> >> > I looked over the steps below that Brandon included and have some
> >> > questions:
> >> >
> >> > step 1 says to "Change /data/default/CtakesSources.txt from "SNOMEDCT"
> >> > to "SNOMEDCT_US". The file that I have has two lines in it. First
> >> > line is SNOMED, second line is SNOMEDCT_US. So this step doesn't really
> >> > make sense.
> >> >
> >> > step 2 should reference the two scripts as being in
> >> > resource/memdbtemplate so others don't have to search for them. Not
> >> > sure what it means to move them to "location to put new UMLS DB".
> >> > Does that mean move them into a new directory where the newly created
> >> > UMLS DB will get written?
> >> >
> >> > steps 3 and 4 for running the tools reference dictionarytool.jar which
> >> > doesn't exist. Does one need to build that somehow from the source
> >> > before running it? The command line also adds "lib/*" to the
> >> > classpath. Is that the lib directory inside the dictionarytool source
> >> > code or some other location?
> >> >
> >> > What else would I need to do to include the SNOMED-CT Veterinary
> >> > Extension along with the snomedct and rxnorm sources?
> >> >
> >> > I'll probably not have time to try this out for a while yet, but when I
> >> > do I'd be happy to write up an easy to follow tutorial for building a
> >> > custom dictionary assuming I am able to get it to work.
> >> >
> >> > Has anyone considered making this tool available outside of the source
> >> > code itself? Like including it in the main cTAKES release? It seems
> >> > there is demand for it.
> >> >
> >> > - Dave
> >> >
> >> > On Tue, Dec 8, 2015 at 3:22 PM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:
> >> >
> >> >> Hi Brandon, thanks for finding and forwarding the instructions!
> >> >>
> >> >> I have checked in two new hsqldb dictionaries, both from the
> >> >> 2015AB version of the UMLS. They both have codes for snomedct_us,
> >> >> rxnorm, icd9cm and icd10pcs - as well as the usual cui, tui,
> >> >> preferred term mappings.
> >> >>
> >> >> One uses cuis filtered by snomed and rxnorm, the other adds cuis
> >> >> filtered by icd9 and icd10.
> >> >> What this means: Cuis that exist for a [filter source] are added
> >> >> to the dictionary, as are all text variations from all sources
> >> >> that contain that cui. Both dictionaries also use the standard
> >> >> ctakes semantic group tui filters.
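
The filter rule described above (a cui is kept if it occurs in at least one filter source, and every text variation of a kept cui is then taken from all sources) can be sketched in a few lines of Java. This only illustrates the rule as stated in the message and is not the dictionary tool's code: the Row type, the build method, and the sample cuis and terms are made up for the example, and the tui filtering is left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration of the "filter source" rule; not cTAKES code.
class CuiFilterSketch {

    /** One vocabulary entry: concept id, source vocabulary, one text form. */
    static final class Row {
        final String cui;
        final String source;
        final String text;

        Row(String cui, String source, String text) {
            this.cui = cui;
            this.source = source;
            this.text = text;
        }
    }

    /** Returns cui -> text forms for every cui that appears in at least one filter source. */
    static Map<String, Set<String>> build(List<Row> rows, Set<String> filterSources) {
        // Pass 1: a cui qualifies if any of its rows comes from a filter source.
        Set<String> keptCuis = new HashSet<>();
        for (Row row : rows) {
            if (filterSources.contains(row.source)) {
                keptCuis.add(row.cui);
            }
        }
        // Pass 2: for qualifying cuis, collect text from all sources, not only filter sources.
        Map<String, Set<String>> dictionary = new TreeMap<>();
        for (Row row : rows) {
            if (keptCuis.contains(row.cui)) {
                dictionary.computeIfAbsent(row.cui, k -> new TreeSet<>()).add(row.text);
            }
        }
        return dictionary;
    }

    public static void main(String[] args) {
        // Made-up sample rows; the cuis and terms are placeholders.
        List<Row> rows = Arrays.asList(
            new Row("C0000001", "SNOMEDCT_US", "heart attack"),
            new Row("C0000001", "MSH", "myocardial infarction"),
            new Row("C0000002", "ICD10PCS", "bypass coronary artery"),
            new Row("C0000002", "MSH", "coronary bypass"));
        Set<String> snorx = new HashSet<>(Arrays.asList("SNOMEDCT_US", "RXNORM"));
        Set<String> icd = new HashSet<>(
            Arrays.asList("SNOMEDCT_US", "RXNORM", "ICD9CM", "ICD10PCS"));
        System.out.println(build(rows, snorx).keySet()); // [C0000001]
        System.out.println(build(rows, icd).keySet());   // [C0000001, C0000002]
    }
}
```

With only the snomed and rxnorm filters, the second concept is dropped even though another source has text for it; adding the icd sources brings it in together with all of its text forms, which matches the concept and term growth reported for ctakesicd2015.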
> >> >>
> >> >> The names are ctakessnorx2015 and ctakesicd2015
> >> >>
> >> >> The snomed rxnorm :
> >> >>
> >> >> https://urldefense.proofpoint.com/v2/url?u=http-3A__sourceforge.net_p_ctakesresources_code_HEAD_tree_trunk_ctakes-2Dresources-2Dsnomed-2Drword-2Dhsqldb-2D2011ab_src_main_resources_org_apache_ctakes_dictionary_lookup_fast_ctakessnorx2015_&d=BQIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=SRqwsl3FmuUXq77GmVlfXn0lE0pVRkL53DNhukcaW6c&s=kWCcj3-hcqYWZXIPhsERggDLCO-5gppCRoS1Gav7r2A&e=
> >> >>
> >> >> The snomed rxnorm icd9 icd10:
> >> >>
> >> >> https://urldefense.proofpoint.com/v2/url?u=http-3A__sourceforge.net_p_ctakesresources_code_HEAD_tree_trunk_ctakes-2Dresources-2Dsnomed-2Drword-2Dhsqldb-2D2011ab_src_main_resources_org_apache_ctakes_dictionary_lookup_fast_ctakesicd2015_&d=BQIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=SRqwsl3FmuUXq77GmVlfXn0lE0pVRkL53DNhukcaW6c&s=RZ--ZQ2qvGnhm4h2Vvz1oU97qA8BG2G39Tww7EdYgKA&e=
> >> >>
> >> >> The svn root for the whole ugly thing is:
> >> >> svn checkout svn://svn.code.sf.net/p/ctakesresources/code/trunk
> >> >>
> >> >> Stats:
> >> >> ctakessnorx2015
> >> >> 545,913 Terms
> >> >> 229,251 Concepts (Cuis)
> >> >> 272,987 Snomed codes
> >> >> 32,419 Rxnorm codes
> >> >> 11,321 icd9 codes
> >> >> 61 icd10 codes
> >> >>
> >> >> Ctakesicd2015
> >> >> 611,230 Terms
> >> >> 282,211 Concepts
> >> >> 18,626 icd9 codes
> >> >> 45,818 icd10 codes
> >> >> Snomed and Rxnorm counts are the same
> >> >>
> >> >> So, adding the icd filters gave us an extra ~53,000 concepts and
> >> >> ~65,000 terms.
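
The rounded differences quoted above follow directly from the two stat blocks. A quick check in Java, with the numbers copied from the message (the class name DictionaryStatsDelta is just for this example):

```java
// Hypothetical helper that recomputes the differences between the two stat blocks.
class DictionaryStatsDelta {

    /** Count in ctakesicd2015 minus the matching count in ctakessnorx2015. */
    static int extra(int ctakesicd2015, int ctakessnorx2015) {
        return ctakesicd2015 - ctakessnorx2015;
    }

    public static void main(String[] args) {
        System.out.println("extra terms:       " + extra(611_230, 545_913)); // 65317
        System.out.println("extra concepts:    " + extra(282_211, 229_251)); // 52960
        System.out.println("extra icd9 codes:  " + extra(18_626, 11_321));   // 7305
        System.out.println("extra icd10 codes: " + extra(45_818, 61));       // 45757
    }
}
```

Most of the gain is on the icd10 side: the snomed/rxnorm-filtered dictionary carries only 61 icd10 codes, while the icd-filtered one carries 45,818.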
> >> >>
> >> >> I would like to move this all to a better root (not
> >> >> ctakes-resources-snomed-rword-hsqldb-2011ab) but I wasn't able to
> >> >> write directly in trunk (??) and need to get moving on to other things.
> >> >>
> >> >> There is help on the ctakes wiki:
> >> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_cTAKES-2B3.2-2B-2D-2BFast-2BDictionary-2BLookup&d=BQIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=SRqwsl3FmuUXq77GmVlfXn0lE0pVRkL53DNhukcaW6c&s=98W_vAHGZ2FLEMPfrSgEHtZt-mQ3XJjF6yQYM26tqP4&e=
> >> >> Though I should probably add a few items ...
> >> >>
> >> >>
> >> >> Sean
> >> >>
> >> >>
> >> >> -----Original Message-----
> >> >> From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
> >> >> Sent: Tuesday, December 08, 2015 12:51 PM
> >> >> To: dev@ctakes.apache.org
> >> >> Subject: RE: ctakes with icd10
> >> >>
> >> >> Not to perpetuate the instructions again but I sent these out not
> >> >> long ago when I was going through the process and Sean was helping me.
> >> >>
> >> >> 1. Change /data/default/CtakesSources.txt from "SNOMEDCT"
> >> >>    to "SNOMEDCT_US"
> >> >> 2. Copy ctakesumls.properties and ctakesumls.script from
> >> >>    memdbtemplate to location to put new UMLS DB
> >> >> 3. Run DictionaryCreator2
> >> >>    java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls "\pathToUmls\META" -atui ./data/tiny/CtakesAnatTuis.txt -db jdbc:hsqldb:file:pathTonewDB\snorx2015 -tbl CUI_TERMS
> >> >> 4. Run CodeMapCreator
> >> >>    java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.CodeMapCreator -umls "\pathToUmls\META" -atui ./data/tiny/CtakesAnatTuis.txt -db jdbc:hsqldb:file:pathTonewDB\snorx2015 -tbl CUI_TERMS
> >> >> 5. Copy new DB files to new location and create a copy of
> >> >>    cTakesHsql.xml and update dictionary location
> >> >>
> >> >> Thanks,
> >> >> Brandon
> >> >>
> >> >> -----Original Message-----
> >> >> From: David Kincaid [mailto:kincaid.dave@gmail.com]
> >> >> Sent: Tuesday, December 08, 2015 12:47 PM
> >> >> To: dev@ctakes.apache.org
> >> >> Subject: Re: ctakes with icd10
> >> >>
> >> >> This seems like a pretty common request and with such an old
> >> >> version of UMLS database shipped with cTAKES it's only going to get worse.
> >> >> I've been wanting to build a dictionary using the latest UMLS
> >> >> release (as well as a custom database), so would be happy to write
> >> >> up the steps as I go through it. That assumes that I can dig up
> >> >> the instructions in the dev list.
> >> >>
> >> >> - Dave
> >> >>
> >> >> On Tue, Dec 8, 2015 at 11:36 AM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:
> >> >>
> >> >> > Hi Alaa,
> >> >> >
> >> >> > The -shortest- answer is that you'll need to run the dictionary
> >> >> > creation tool. There are instructions in older devlist threads.
> >> >> > By default the dictionary creation tool does add icd9 and icd10
> >> >> > tables to the dictionary.
> >> >> > The problem is that in Umls 2011AB those codes weren't very well
> >> >> > populated. The 2015AB icd# set is much more rich so those
> >> >> > tables should be pretty good. Then in ctakes you would look up
> >> >> > annotations by icd9 or icd10 codes instead of by cui:
> >> >> > OntologyConceptUtil.getAnnotationsByCode( jcas, lookupWindow, icd#Code );
> >> >> > OntologyConceptUtil.getAnnotationsByCode( jcas, icd#Code );
> >> >> >
> >> >> > Sean
> >> >> >
> >> >> > -----Original Message-----
> >> >> > From: Savova, Guergana [mailto:Guergana.Savova@childrens.harvard.edu]
> >> >> > Sent: Tuesday, December 08, 2015 12:17 PM
> >> >> > To: dev@ctakes.apache.org
> >> >> > Subject: RE: ctakes with icd10
> >> >> >
> >> >> > Hi Alaa,
> >> >> > You need to create a resource off the terminology/ontology you
> >> >> > want to use (in this case ICD9 or ICD10). Then run that resource
> >> >> > with cTAKES for the fast dictionary lookup. There is cTAKES code
> >> >> > and some documentation on how to create that resource. By
> >> >> > default, cTAKES runs with a resource created from the English
> >> >> > version of SNOMED CT and RxNORM.
> >> >> > Hope this helps.
> >> >> > --Guergana
> >> >> >
> >> >> > -----Original Message-----
> >> >> > From: Alaa al Barari [mailto:alaa.albarari@gmail.com]
> >> >> > Sent: Tuesday, December 8, 2015 10:01 AM
> >> >> > To: dev@ctakes.apache.org
> >> >> > Subject: ctakes with icd10
> >> >> >
> >> >> > Hi,
> >> >> >
> >> >> > I downloaded Latest umls version, and I want to know how to make
> >> >> > ctakes work with icd10 and icd9.
> >> >> >
> >> >> >
> >> >> > Thanks
> >> >> >
> >> >>
> >> >>
> >> >> IMPORTANT WARNING: The information in this message (and the
> >> >> documents attached to it, if any) is confidential and may be legally
> >> >> privileged. It is intended solely for the addressee. Access to this
> >> >> message by anyone else is unauthorized. If you are not the intended
> >> >> recipient, any disclosure, copying, distribution or any action
> >> >> taken, or omitted to be taken, in reliance on it is prohibited and
> >> >> may be unlawful. If you have received this message in error,
> >> >> please delete all electronic copies of this message (and the
> >> >> documents attached to it, if any), destroy any hard copies you may
> >> >> have created and notify me immediately by replying to this email.
> >> >> Thank you.
> >> >>
> >> >> Geisinger Health System utilizes an encryption process to
> >> >> safeguard Protected Health Information and other confidential data
> >> >> contained in external e-mail messages. If email is encrypted, the
> >> >> recipient will receive an e-mail instructing them to sign on to
> >> >> the Geisinger Health System Secure E-mail Message Center to retrieve
> >> >> the encrypted e-mail.
> >> >>
> >> >
> >
> >
> >
> > --
> > Eng Alaa Al-Barari
> > phone 0599297470
> >
>
>
> --
> Eng Alaa Al-Barari
> phone 0599297470
>

--001a114020ac5a7195052685f359--