uima-user mailing list archives

From Neal R Lewis <nrle...@us.ibm.com>
Subject Re: Public Gold Standards?
Date Mon, 25 Jun 2012 16:28:54 GMT

Hi Leonard,

We usually have to build our own gold standards, depending on what we're
looking for.


What we use for clinical documents is mtsamples.  http://www.mtsamples.com
These are medically transcribed notes from multiple disciplines.  They are
de-identified, but not annotated.

Another option, if you're looking for gold standards, is to check out i2b2:
https://www.i2b2.org/NLP/DataSets/Main.php

I haven't used their datasets, so I'm not exactly sure how to get them, but
I think if you register you might be able to grab datasets for smoking,
medications, and relationships.

Good luck,

Neal



From:	Leonard Jacuzzo <jacuzzo@gmail.com>
To:	user@uima.apache.org
Date:	06/22/2012 07:06 PM
Subject:	Public Gold Standards?



Hi, I know this is not a UIMA-specific question, but I am exploring NLP and
UIMA.

But I don't have the resources to develop a medical gold standard set of
annotated documents. To do any real exploration, I need one of these.

Does anyone on this list know where I can obtain de-identified gold
standard documents with which to test my setups?


Any help will be greatly appreciated.

Best wishes,
Leonard
