uima-user mailing list archives

From Neal R Lewis <nrle...@us.ibm.com>
Subject Re: Public Gold Standards?
Date Mon, 25 Jun 2012 16:28:54 GMT

Hi Leonard,

We usually have to build our own gold standards, depending on what we're
looking for.

What we use for clinical documents is mtsamples.  http://www.mtsamples.com
These are medically transcribed notes from multiple disciplines.  They are
de-identified, but not annotated.

Another option, if you're looking for gold standards, is to check out i2b2.

I haven't used their datasets, so I'm not exactly sure how to get them, but
I think if you register you might be able to grab datasets for smoking,
medications, and relationships.

Good luck,


From:	Leonard Jacuzzo <jacuzzo@gmail.com>
To:	user@uima.apache.org
Date:	06/22/2012 07:06 PM
Subject:	Public Gold Standards?

Hi, I know this is not a UIMA-specific question, but I am exploring NLP and

But I don't have the resources to develop a Medical Gold Standard set of
annotated documents. To do any real exploration, I need one of these.

Does anyone on this list know where I can obtain de-identified gold
standard documents with which to test my setups?

Any help will be greatly appreciated.

Best wishes,
