ctakes-dev mailing list archives

From "John Green" <john.travis.gr...@gmail.com>
Subject Re: Announcement: UMLS MedGen-MySQL dataset now available as open access download
Date Thu, 13 Nov 2014 18:53:47 GMT
So the old licensed setup would be kept as a packaged option, much as it is now, with the
unlicensed dataset going out in place of the current "free" dictionary? Am I understanding that right?


JG

On Thu, Nov 13, 2014 at 1:40 PM, andy mcmurry <mcmurry.andy@gmail.com>
wrote:

> I'll crunch the numbers -- in the meantime I can tell you that phenotypes
> vary by semantic type. Clinical attributes from SNOMED are abundant, and many
> concepts in MeSH are mapped to diseases. There are also tons of
> "pharmacological substances".
> On Nov 12, 2014 6:19 AM, "Dligach, Dmitriy" <
> Dmitriy.Dligach@childrens.harvard.edu> wrote:
>> Andy, thank you for this resource!
>>
>> Do you have an estimate of what percentage of UMLS concepts were left out?
>>
>> Dima
>>
>>
>>
>>
>> On Nov 11, 2014, at 16:02, andy mcmurry <mcmurry.andy@gmail.com> wrote:
>>
>> > Hello!
>> >
>> > https://bitbucket.org/invitae/medgen-mysql (Apache Licensed ASL2)
>> >
>> > We just released a new library containing a huge chunk of UMLS concepts
>> > which are available without registering accounts/username/passwords.
>> > LEGALLY. Yes, really!
>> >
>> > The subset is from NCBI and it contains *thousands of concepts from SNOMED
>> > and other vocabularies*.
>> >
>> > The code is essentially:
>> > 1. a list of wget targets on various NCBI FTP site mirrors
>> > 2. a Makefile for building the databases of interest
>> >
>> > Our legal team has approved distribution for Open Access work, ASL2
>> > LICENSE.
>> >
>> > I recommend we use this opportunity to make this the default distribution
>> > for CTAKES UMLS connections, because it obviates the need for so much
>> > painful credentialing and back and forth agreements with the US National
>> > Library of Medicine.
>> >
>> > Cheers!
>> > --Andy
>> >
>> >
>> > On Wed, Sep 10, 2014 at 12:13 PM, Masanz, James J. <Masanz.James@mayo.edu>
>> > wrote:
>> >
>> >>
>> >> I would love to see the install be as simple as apt-get install to end up
>> >> with a working dictionary that has more than a handful of entries to
>> >> get users started.
>> >>
>> >> Regards,
>> >> James Masanz
>> >>
>> >> -----Original Message-----
>> >> From: andy mcmurry [mailto:mcmurry.andy@gmail.com]
>> >> Sent: Tuesday, September 09, 2014 4:32 PM
>> >> To: ctakes-dev@incubator.apache.org
>> >> Subject: Recommendation for ctakes default (UMLS) dictionaries
>> >>
>> >> Greetings ctakes-dev:
>> >>
>> >> *UMLS license restrictions have been getting more lax over the years* --
>> >> much of the UMLS can be downloaded directly from the NCBI official FTP
>> >> site.
>> >>
>> >> In fact, the NIH (and implicitly the NLM) *have already made the standard
>> >> terms public for some medical specialities*.
>> >>
>> >> For example: Here is the UMLS subset specific to Medical Genetics (MedGen)
>> >> and Genetic Testing (GTR), complete with SNOMED-CT concept CUI(s), names,
>> >> etc.:
>> >>
>> >> [  ftp://ftp.ncbi.nlm.nih.gov/pub/medgen/README.html  ]
>> >>
>> >> My team has developed a JVM-based wrapper (in Clojure) for MetaMap 2013AB,
>> >> which I intend to open source soon. It includes REST support for
>> >> invoking MetaMap with any or all of the command line arguments.
>> >> We do not integrate with UIMA; we are basically a wrapper around the
>> >> binary installation of MetaMap. The emphasis is on publication text, not
>> >> clinical text. Still, some services are common (such as LVG).
>> >>
>> >> Strangely, the NLM still requires UMLS licenses to download MetaMap
>> >> execution binaries. The MetaMap binary install is better, but customizing
>> >> dictionaries (DataFileBuilder) is not as easy as it is in cTAKES with
>> >> YTEX:
>> >>
>> >> [ https://cwiki.apache.org/confluence/display/CTAKES/YTEX+Installation ]
>> >>
>> >> *Hence, there is a real opportunity here to enable Apache cTAKES to
>> >> have a stronger default dictionary.*
>> >>
>> >> Imagine if we could
>> >> $ apt-get install apache-ctakes
>> >>
>> >> and instantly have a working package for SOME problem domain.
>> >> In my case (Medical Genetics) the UMLS definitions are already available
>> >> and the UMLS license problem becomes a non-issue, at least for many
>> >> first-time users.
>> >>
>> >> Your thoughts?
>> >> AndyMC
>> >>
>>
>>