From: "Koola, Jejo David"
To: "dev@ctakes.apache.org"
Subject: Re: Acronym annotator
Date: Fri, 22 Aug 2014 15:26:04 +0000

You might be interested in: https://sbmi.uth.edu/ccb/resources/abbreviation.htm

On Aug 21, 2014, at 2:08 PM, John Green wrote:

> Are there any acronym annotators and disambiguators? What are people doing in production elsewhere?
>
> I'm learning the heart of cTAKES and UIMA by the numbers right now, and I think writing an annotator of my own will be the best way to solidify the information. If no one has done it already, I thought I'd write a simple acronym annotator and disambiguator.
>
> The disambiguation would just be a co-occurrence count over a lookup window across a private corpus I have access to, e.g., word1 word2 word3 acronym1 word4 word5 word6. I would improve specificity by excluding words that tend to occur frequently across the different acronyms that share the same abbreviation.
>
> But if someone has already done it and is planning on releasing it, I'd hate to reinvent the wheel...
>
> JG
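
For anyone following the thread, the windowed co-occurrence idea described above could be sketched roughly as follows. This is only an illustration in plain Python, not cTAKES or UIMA code, and it makes several assumptions the original post does not state: that a small set of sense-labeled examples is available, that the window is three words on each side (matching the word1..word6 example), and that "excluding frequent words" means dropping context words seen with every sense of the same abbreviation. All function names and the "RA" sample sentences are hypothetical.

```python
from collections import Counter, defaultdict

WINDOW = 3  # assumed: three words on each side of the acronym


def context_window(tokens, index, window=WINDOW):
    """Return the lowercased words within `window` positions of tokens[index]."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return [t.lower() for t in left + right]


def build_profiles(labeled_examples, window=WINDOW):
    """Count context words per acronym and sense.

    labeled_examples: iterable of (tokens, acronym_index, sense) tuples.
    This input format is hypothetical; the post does not specify one.
    """
    profiles = defaultdict(lambda: defaultdict(Counter))
    for tokens, index, sense in labeled_examples:
        acronym = tokens[index].lower()
        profiles[acronym][sense].update(context_window(tokens, index, window))
    return profiles


def drop_shared_words(profiles):
    """Remove context words that occur with every sense of an acronym."""
    for senses in profiles.values():
        if len(senses) < 2:
            continue  # nothing to discriminate between
        shared = set.intersection(*(set(c) for c in senses.values()))
        for counter in senses.values():
            for word in shared:
                del counter[word]
    return profiles


def disambiguate(profiles, tokens, index, window=WINDOW):
    """Return the sense with the largest context overlap, or None if no evidence."""
    senses = profiles.get(tokens[index].lower())
    if not senses:
        return None
    context = context_window(tokens, index, window)
    scores = {s: sum(c[w] for w in context) for s, c in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Illustrative training data only; not taken from any real corpus.
examples = [
    ("the patient has ra in both hands".split(), 3, "rheumatoid arthritis"),
    ("the catheter entered ra of the heart".split(), 3, "right atrium"),
]
profiles = drop_shared_words(build_profiles(examples))
print(disambiguate(profiles, "the catheter reached ra near the heart".split(), 3))
```

Raw counts are used here for simplicity; a real annotator would likely need normalization for sense frequency and a much larger corpus before the scores mean anything.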