incubator-ctakes-dev mailing list archives

From Jörn Kottmann <kottm...@gmail.com>
Subject Re: [DISCUSS] What should we do with cTAKES resources?
Date Tue, 22 Jan 2013 10:35:12 GMT
On 01/22/2013 04:00 AM, Masanz, James J. wrote:
> Jörn,
>
> Today Benson wrote the following in this post to incubator http://s.apache.org/Gz5
> "I fear that cTakes needs to have an interaction with LEGAL to adopt the SpamAssassin
> model, since, from a strict constructionist perspective, the source of the models is precisely
> what you cannot release."
>
> Is he just unaware of some discussion you already had with LEGAL for OpenNLP - I ask
> because in the discussion below you indicated it would be OK to release models at Apache without
> releasing the data the models were built from. Is there some previous post we can point to
> or should I open a discussion with LEGAL about cTAKES models
>
>

I was under the assumption that it is OK to release just the model, and
not the training data, under AL 2.0 here at Apache.
Over at UIMA we had a similar discussion for the French POS Tagger
(UIMA-2146). There the concern was that it's very cumbersome
to train again on the data, not that it can't be released.

To circumvent this particular issue, it should be possible to release the
models outside of Apache and then just redistribute
them as a class A dependency in the cTAKES binary distribution.

Jörn
