ctakes-dev mailing list archives

From andy mcmurry <mcmurry.a...@gmail.com>
Subject Re: Announcement: UMLS MedGen-MySQL dataset now available as open access download
Date Thu, 13 Nov 2014 18:39:13 GMT
I'll crunch the numbers. In the meantime, I can tell you that phenotypes
vary by semantic type: clinical attributes from SNOMED are abundant, many
MeSH concepts are mapped to diseases, and there are tons of "pharmacological
substances".
On Nov 12, 2014 6:19 AM, "Dligach, Dmitriy" <
Dmitriy.Dligach@childrens.harvard.edu> wrote:

> Andy, thank you for this resource!
>
> Do you have an estimate of what percentage of UMLS concepts were left out?
>
> Dima
>
>
>
>
> On Nov 11, 2014, at 16:02, andy mcmurry <mcmurry.andy@gmail.com> wrote:
>
> > Hello!
> >
> > https://bitbucket.org/invitae/medgen-mysql (Apache Licensed ASL2)
> >
> > We just released a new library containing a huge chunk of UMLS concepts
> > which are available without registering accounts, usernames, or passwords.
> > LEGALLY. Yes, really!
> >
> > The subset is from NCBI and it contains *thousands of concepts from SNOMED
> > and other vocabularies*.
> >
> > The code is essentially
> > 1. a list of wget targets on various NCBI FTP site mirrors
> > 2. a Makefile for building the databases of interest
> >
> > Our legal team has approved distribution for Open Access work, ASL2
> > LICENSE.
> >
> > I recommend we use this opportunity to make this the default distribution
> > for cTAKES UMLS connections, because it obviates the need for so much
> > painful credentialing and back-and-forth agreements with the US National
> > Library of Medicine.
> >
> > Cheers!
> > --Andy
> >
> >
> > On Wed, Sep 10, 2014 at 12:13 PM, Masanz, James J. <Masanz.James@mayo.edu>
> > wrote:
> >
> >>
> >> I would love to see the install be as simple as apt-get install, so that
> >> users end up with a working dictionary that has more than a handful of
> >> entries to get them started.
> >>
> >> Regards,
> >> James Masanz
> >>
> >> -----Original Message-----
> >> From: andy mcmurry [mailto:mcmurry.andy@gmail.com]
> >> Sent: Tuesday, September 09, 2014 4:32 PM
> >> To: ctakes-dev@incubator.apache.org
> >> Subject: Recommendation for ctakes default (UMLS) dictionaries
> >>
> >> Greetings ctakes-dev:
> >>
> >> *UMLS license restrictions have been getting more lax over the years* --
> >> much of the UMLS can be downloaded directly from the official NCBI FTP
> >> site.
> >>
> >> In fact, the NIH (and implicitly the NLM) *have already made the standard
> >> terms public for some medical specialities*.
> >>
> >> For example, here is the UMLS subset specific to Medical Genetics (MedGen)
> >> and Genetic Testing (GTR), complete with SNOMED-CT concept CUI(s), names,
> >> etc.:
> >>
> >> [  ftp://ftp.ncbi.nlm.nih.gov/pub/medgen/README.html  ]
> >>
> >> My team has developed a JVM-based wrapper for MetaMap 2013AB, written in
> >> Clojure, which I intend to open source soon. It includes REST support for
> >> invoking MetaMap with any or all of the command line arguments.
> >> We do not integrate with UIMA; we are basically a wrapper around the
> >> binary installation of MetaMap. The emphasis is on publication text, not
> >> clinical text; still, some services are common (such as LVG).
> >>
> >> Strangely, the NLM still requires UMLS licenses to download MetaMap
> >> execution binaries. The MetaMap binary install is better, but customizing
> >> dictionaries (DataFileBuilder) is not as easy as with cTAKES and YTEX:
> >>
> >> [ https://cwiki.apache.org/confluence/display/CTAKES/YTEX+Installation ]
> >>
> >> *Hence, there is a real opportunity here to enable Apache cTAKES to
> >> have a stronger default dictionary.*
> >>
> >> Imagine if we could
> >> *$ apt-get install apache-ctakes*
> >>
> >> and instantly have a working package for SOME problem domain.
> >> In my case (Medical Genetics), the UMLS definitions are already available
> >> and the UMLS license problem becomes a non-issue, at least for many
> >> first-time users.
> >>
> >> Your thoughts?
> >> AndyMC
> >>
>
>
