From: Peter Szolovits
To: dev@ctakes.apache.org
Subject: Re: to map UMLS CUI with normalized form
Date: Thu, 4 Sep 2014 07:34:07 -0400
Message-Id: <42D64FF4-1DC2-42E4-805D-A5FD968EA048@mit.edu>

You need to (a) get a (free) license to use UMLS, then (b) download the
(large) distribution, and (c) install it in a local database. See
https://uts.nlm.nih.gov

On Sep 4, 2014, at 5:33 AM, Prakash Poudyal wrote:

> Hi Peter,
>
> Thanks, that is wonderful. Can you tell me how to get this database?
>
>
> On Mon, Sep 1, 2014 at 10:32 PM, Peter Szolovits wrote:
>
>> A single CUI may have many different preferred names in different
>> vocabularies. If you have a MySQL version of UMLS installed, you can do
>> something like
>> CREATE VIEW pname AS
>> select cui, lat, ts, lui, stt, sui, ispref, aui, saui, scui, sdui, sab,
>> tty, code, str, srl, suppress, cvf
>> from `mrconso`
>> where (ts = 'P')
>> and (stt = 'PF')
>> and (ispref = 'Y')
>> and (lat = 'ENG');
>>
>> to define the preferred terms for each CUI. I.e., this gives a subset of
>> MRCONSO containing only the preferred name of each concept. As you can see
>> from the table below, some vocabularies provide many more of these than
>> others.
>> Then, to find that preferred term for a concept, you can just do
>> something like:
>> mysql> select str from pname where cui='C1141949';
>> +----------------------+
>> | STR                  |
>> +----------------------+
>> | Troponin I increased |
>> +----------------------+
>> 1 row in set (0.00 sec)
>>
>> but
>>
>> mysql> select distinct str from mrconso where cui='C1141949';
>> +---------------------------------+
>> | str                             |
>> +---------------------------------+
>> | Troponin I zvýšený              |
>> | troponine I verhoogd            |
>> | Troponin I increased            |
>> | Troponine I augmentée           |
>> | Troponin I erhoeht              |
>> | Troponin I emelkedett           |
>> | Troponina I aumentata           |
>> | トロポニンＩ増加                |
>> | ﾄﾛﾎﾟﾆﾝIｿﾞｳｶ                     |
>> | Troponina I aumentada           |
>> +---------------------------------+
>> 10 rows in set (0.00 sec)
>>
>> In this case, the source of ambiguity is only from different languages,
>> but it could also be from the same CUI appearing in different SABs
>> (vocabularies) with different names.
>>
>> mysql> select sab, count(*) c from pname group by sab order by c desc
>> limit 40;
>> +---------------+--------+
>> | SAB           | c      |
>> +---------------+--------+
>> | NCBI          | 788995 |
>> | MSH           | 308591 |
>> | MEDCIN        | 253291 |
>> | SNOMEDCT      | 228656 |
>> | RXNORM        | 220148 |
>> | ICD10PCS      | 178093 |
>> | MTH           | 157083 |
>> | LNC           | 131930 |
>> | ICD10CM       |  81082 |
>> | FMA           |  72645 |
>> | GO            |  57326 |
>> | OMIM          |  46058 |
>> | NCI           |  40962 |
>> | RCD           |  34638 |
>> | MDR           |  24144 |
>> | ICPC2ICD10ENG |  23618 |
>> | MMX           |  22387 |
>> | CPT           |  20064 |
>> | MMSL          |  19419 |
>> | UMD           |  15386 |
>> | NDDF          |  14635 |
>> | VANDF         |  12483 |
>> | SNMI          |  11222 |
>> | NIC           |  10487 |
>> | MTHSPL        |   9978 |
>> | NDFRT         |   9764 |
>> | ICD10AM       |   9098 |
>> | CCPSS         |   8226 |
>> | MTHFDA        |   6545 |
>> | ICD9CM        |   6521 |
>> | AOD           |   6513 |
>> | RCDSY         |   6275 |
>> | HCPCS         |   5200 |
>> | HL7V3.0       |   5097 |
>> | PDQ           |   4941 |
>> | MDDB          |   4938 |
>> | MTHICD9       |   4721 |
>> | CSP           |   3793 |
>> | GS            |   3770 |
>> | NOC           |   3645 |
>> +---------------+--------+
>> 40 rows in set (5.31 sec)
>>
>> On Sep 1, 2014, at 4:37 PM, Prakash Poudyal wrote:
>>
>>> Hi Chen,
>>>
>>> Thanks for the mail. I may be wrong,
>>>
>>> dizziness (normalized form) = C0002940 (CUI value)
>>>
>>> I am looking for a system in which, if I enter C0002940, then dizziness
>>> would come up. Or is there any index or dictionary for it?
>>>
>>> If you don't understand, please write to me again.
>>>
>>> Thanks
>>>
>>> Regards
>>> Prakash
>>>
>>>
>>> On Mon, Sep 1, 2014 at 9:13 PM, Chen, Pei <Pei.Chen@childrens.harvard.edu>
>>> wrote:
>>>
>>>> Hi Prakash,
>>>> Could you clarify what you mean by 'normalized form'? An example?
>>>>
>>>> -Pei
>>>>
>>>> Sent from my iPhone
>>>>
>>>>> On Sep 1, 2014, at 9:16 AM, "Prakash Poudyal" <prakashpoudyal@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> I am working in cTAKES to analyze clinical documents.
>>>>> Is it possible for me to know how the CUI code is assigned to the
>>>>> normalized form?
>>>>>
>>>>> Is there any dictionary or web portal that could map a SNOMED CT UMLS CUI
>>>>> code to its normalized form?
>>>>>
>>>>> --
>>>>>
>>>>> Regards
>>>>> Prakash Poudyal
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Regards
>>> Prakash Poudyal
>>
>>
>
>
> --
>
> Regards
> Prakash Poudyal
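
For anyone who wants to try the CUI-to-preferred-name lookup from this thread without a full UMLS install, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for the MySQL database, and a toy `mrconso` table with only the columns the view filters on. The three rows reuse strings from the thread, but their `lat`, `ts`, `stt`, `ispref` and `sab` values are assumptions made for illustration, and the helper names (`build_demo_db`, `preferred_names`, `all_names`) are hypothetical.

```python
import sqlite3

# Toy stand-in for MRCONSO. The flag values and the 'DEMO' source name are
# assumptions for illustration only, not real UMLS content.
ROWS = [
    # cui,       lat,   ts,  stt,  ispref, sab,    str
    ("C1141949", "ENG", "P", "PF", "Y", "DEMO", "Troponin I increased"),
    ("C1141949", "DUT", "P", "PF", "Y", "DEMO", "troponine I verhoogd"),
    ("C1141949", "GER", "P", "PF", "Y", "DEMO", "Troponin I erhoeht"),
]


def build_demo_db():
    """Create an in-memory database with a reduced mrconso table and the pname view."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mrconso "
        "(cui TEXT, lat TEXT, ts TEXT, stt TEXT, ispref TEXT, sab TEXT, str TEXT)"
    )
    conn.executemany("INSERT INTO mrconso VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)
    # Same filter as the view in the thread: English preferred terms only.
    conn.execute(
        "CREATE VIEW pname AS "
        "SELECT cui, lat, ts, stt, ispref, sab, str FROM mrconso "
        "WHERE ts = 'P' AND stt = 'PF' AND ispref = 'Y' AND lat = 'ENG'"
    )
    return conn


def preferred_names(conn, cui):
    """Return the distinct English preferred names for a CUI (possibly several)."""
    cur = conn.execute(
        "SELECT DISTINCT str FROM pname WHERE cui = ? ORDER BY str", (cui,)
    )
    return [row[0] for row in cur]


def all_names(conn, cui):
    """Return every distinct string recorded for a CUI, in any language."""
    cur = conn.execute(
        "SELECT DISTINCT str FROM mrconso WHERE cui = ? ORDER BY str", (cui,)
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = build_demo_db()
    print(preferred_names(conn, "C1141949"))
    print(all_names(conn, "C1141949"))
```

Against a real installation the same two queries would run on the MySQL connection instead. As noted in the thread, a CUI can still return more than one row from `pname` when several vocabularies each contribute a preferred name, which is why the helper returns a list rather than a single string.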