incubator-ctakes-dev mailing list archives

From "Masanz, James J." <Masanz.Ja...@mayo.edu>
Subject RE: [DISCUSS] What should we do with cTAKES resources?
Date Tue, 22 Jan 2013 03:00:15 GMT

Jörn, 

Today Benson wrote the following in a post to the incubator list (http://s.apache.org/Gz5):
"I fear that cTakes needs to have an interaction with LEGAL to adopt the SpamAssassin model,
since, from a strict constructionist perspective, the source of the models is precisely what
you cannot release."

Is he just unaware of some discussion you already had with LEGAL for OpenNLP? I ask because
in the discussion below you indicated it would be OK to release models at Apache without releasing
the data the models were built from. Is there some previous post we can point to, or should
I open a discussion with LEGAL about cTAKES models?


-- James Masanz

> -----Original Message-----
> From: ctakes-dev-return-811-Masanz.James=mayo.edu@incubator.apache.org
> [mailto:ctakes-dev-return-811-Masanz.James=mayo.edu@incubator.apache.org]
> On Behalf Of Jörn Kottmann
> Sent: Monday, November 05, 2012 6:41 AM
> To: ctakes-dev@incubator.apache.org
> Subject: Re: [DISCUSS] What should we do with cTAKES resources?
> 
> In my opinion we should release what we can from here at Apache and only
> the resources which have an incompatible license need to be handled
> differently, e.g. external site.
> 
> Models which are trained on private clinical data can be released as long
> as the original creator decides to license them under AL 2.0. If that is
> done by a committer it should be fine to just check them in or put them on
> the website.
> 
> The Wikipedia license is compatible, and an index of it is as well, but we
> probably need to have attribution for it in a NOTICE file, and maybe
> include the license in the LICENSE file.
> 
> Jörn
> 
> On 11/02/2012 10:46 PM, Chen, Pei wrote:
> > I think we postponed this topic previously and since the ASF code seems
> to be in decent shape now, I think it's time to revisit this discussion
> for the longer term.
> > Currently, we have the below resources bundled with our source code
> > and distribution
> >
> > -          UMLS dictionaries (hsqldb format and in lucene indexes)
> >
> > -          Models (which were okay to release as open source) that
> have been trained on various clinical data
> >
> > -          Wikipedia index
> >
> > What are our options, given that ASF source code, binaries, models, and
> > dependencies all need to be compliant with ASL 2.0
> > (http://www.apache.org/legal/3party.html)?
> >
> > 1)      Leave things as they are, but we need to confirm with the
> sources and also will probably need to seek approval from Apache Legal for
> each of the resources
> >
> > 2)      Host the resources externally, such as on SourceForge, similar to
> the OpenNLP models (http://opennlp.sourceforge.net/models-1.5/)
> >
> > a.       Single zip per release for users to download?
> >
> > Option 2 seems the least painful in terms of compliance.
> > Since 3.0.0-incubating, each resource has a fully qualified name/path
> and is read from the classpath so it should be fairly easy if we decided
> to pull it in from external sources.
> >
> > --Pei
> >
> >

