Date: Tue, 30 Sep 2014 11:43:35 -0700 (PDT)
From: "John Green"
To: dev@ctakes.apache.org
Cc: dev@ctakes.apache.org
Subject: Re: De-identified lab tests dataset

I could pull a dozen or so "sets" of labs from my own personal bank of notes, each containing some form of what you would usually call the lab section of a SOAP note, with minimal effort... I don't mind; it might take me a couple of days with work tempo as it is. It's probably all from two different EMRs in total, though, with a handful of written values in shorthand (e.g. the classic fishbones used for things like BNP and CBC), so not a lot of variability, but maybe enough to start compiling regexes with.

If that's helpful and no one else comes along with some free data of a larger sort...
Also, there are about 10 notes I committed to the project a year or so ago as examples that may have lab data in them.

JG
--
Sent from Mailbox

On Tue, Sep 30, 2014 at 8:25 AM, Ajay Jain wrote:

> John,
> I am in the initial stages of my project and I'll take whatever dataset you are able to provide without spending a lot of effort extracting it.
> Thanks.
> Ajay
> Sent from my iPhone
>> On Sep 30, 2014, at 5:22 AM, "John Green" wrote:
>>
>> How large? And across how many EMRs?
>>
>> JG
>> --
>> Sent from Mailbox
>>
>> On Mon, Sep 29, 2014 at 6:58 PM, Ajay Jain wrote:
>>
>>> Sorry, I wasn't clear. I am working on a related project and trying to figure out if the code can be repurposed for a lab mention annotator for cTAKES. From what I have seen, test names from different institutions are not standardized, which makes it hard to standardize the resulting annotation. Getting access to a larger (structured) lab tests dataset will help me fine-tune the model.
>>>
>>> Hope this helps.
>>> Ajay
>>> Sent from my iPhone
>>>> On Sep 29, 2014, at 2:12 PM, "Savova, Guergana" wrote:
>>>>
>>>> Ajay,
>>>> cTAKES currently does not implement a method to discover labs from the text. The motivation is that you can get that easily from the structured part of the EMR (what Pete explained below). Hope this makes sense!
>>>> --Guergana
>>>>
>>>> -----Original Message-----
>>>> From: Peter Szolovits [mailto:psz@mit.edu]
>>>> Sent: Monday, September 29, 2014 2:32 PM
>>>> To: dev@ctakes.apache.org
>>>> Subject: Re: De-identified lab tests dataset
>>>>
>>>> Ajay, I'm confused by your query. cTAKES is good at interpreting text, but most lab test results are reported in tabular form that is most appropriately searched by SQL queries.
>>>> Sometimes lab results are also reported in narrative notes, but parsing those is often more a matter of deciphering the text structure of tables than of parsing real English text. What am I misunderstanding?
>>>>
>>>> --Pete Sz.
>>>>
>>>>> On Sep 29, 2014, at 2:25 PM, Ajay Jain wrote:
>>>>>
>>>>> Hello All,
>>>>>
>>>>> I am working on a use case for lab tests data using cTAKES and my online search to find a test dataset has been futile. I'll greatly appreciate if someone can share such a dataset or can point me in the right direction to go looking for one.
>>>>>
>>>>> Best,
>>>>> Ajay
>>>>>
>>>>> --
>>>>> Founder & CEO
>>>>> Mobile Insights, Inc.
>>>>> (630) 408-8623