From: Neal R Lewis
To: user@uima.apache.org
Date: Mon, 25 Jun 2012 09:28:54 -0700
Subject: Re: Public Gold Standards?

Hi Leonard,

We usually have to build our own gold standards, depending on what we're looking for.


What we use for clinical documents is mtsamples: http://www.mtsamples.com. These are medically transcribed notes from multiple disciplines. They are de-identified, but not annotated.

Another option, if you're looking for gold standards, is to check out i2b2: https://www.i2b2.org/NLP/DataSets/Main.php

I haven't used their datasets, so I'm not exactly sure how to get them, but I think if you register you might be able to grab datasets for smoking, medications, and relationships.

Good luck,

Neal


From: Leonard Jacuzzo <jacuzzo@gmail.com>

To: user@uima.apache.org
Date: 06/22/2012 07:06 PM
Subject: Public Gold Standards?





Hi, I know this is not a UIMA-specific question, but I am exploring NLP and
UIMA.

But I don't have the resources to develop a Medical Gold Standard set of
annotated documents. To do any real exploration, I need one of these.

Does anyone on this list know where I can obtain de-identified gold
standard documents with which to test my setups?


Any help will be greatly appreciated.

Best wishes,
Leonard
