incubator-ctakes-dev mailing list archives

From "Chen, Pei"
Subject RE: Creative Commons License (was: checking in wiki)
Date Thu, 03 Jan 2013 21:56:57 GMT
Thanks for the info, Ted.  I had the same interpretation but decided to contact the folks
at Wikipedia just to make sure.  Their response is below.

I am not an attorney, but if I am reading it correctly, I *think* we should be able to include
it in the project and add the attribution to the NOTICE/LICENSE.

Do you know if this is something for which we would need an okay from Apache Legal?

Re: [Ticket#2013010310007005] Creative Commons License and Apache (was: checking in wiki)

Dear Chen Pei,

Thank you for your email.  Our response follows your message.

01/03/2013 16:35 - Chen Pei wrote:

> Dear Licensing@Wikipedia,
>
> We have an incubator project at the Apache Software Foundation and had a
> question about reusing content from Wikipedia.
>
> We built a Lucene index with 5000 Wikipedia articles relating to medicine.
> Each article is modified by reducing it to a list of words and their counts
> in that article.  Would this term count transformation be okay under the
> Wikipedia license for inclusion in an ASL 2.0 project?  Is it considered a
> new work, or do we need a specific license for this purpose?
>
> Email discussion thread on Apache:
>
> Thanks,
> Pei


In principle, all text in Wikipedia is subject to the Creative Commons Attribution-ShareAlike
License (CC-BY-SA) and may be used free of charge for any purpose. Reading more about the
license should help explain it in simpler terms:

<> Images and other media files may be
subject to other licenses, which can be seen upon clicking on the desired image or file.

A specific permission for reusing the content is not necessary, as long as the re-user observes
the license conditions. CC-BY-SA allows commercial use. The only thing that needs to be done
is attribution ('BY'), which can simply be a link to the history page of an article <>,
and re-releasing the content under similar licenses <>

For more information please see:

<> or <>.

Please note: Neither the Wikimedia Foundation, nor the authors of articles on Wikimedia sites,
nor the volunteers answering mail to this address provide legal advice. It is your responsibility,
if you intend to reuse content from Wikimedia sites, to determine how the licenses of the
content that we host apply to your intended uses.

Yours sincerely,


Wikipedia -


Disclaimer: all mail to this address is answered by volunteers, and responses are not to be
considered an official statement of the Wikimedia Foundation. For official correspondence,
please contact the Wikimedia Foundation by certified mail at the address listed on

From: Ted Dunning
Sent: Wednesday, January 02, 2013 7:38 PM
Subject: Re: Creative Commons License (was: checking in wiki)

One non-legally-binding precedent is the RCV1 corpus, where Reuters agreed that

    "Summaries, analyses and interpretations of the linguistic properties of the information
may be derived and published provided it is not possible to reconstruct the Data from the

This was part of an agreement, so it sets no legal precedent, but it does indicate that at least
one fairly strict copyright interpreter was OK with the term count transformation.

On Wed, Jan 2, 2013 at 4:33 PM, Benson Margulies
On Wed, Jan 2, 2013 at 2:30 PM, Tim Miller
> The license is Share Alike 3.0. The reason we need advice is that we are
> using a modified/derived version (the clause in the legal FAQ starts
> "Unmodified media..."). Specifically, we built a Lucene index with 5000
> Wikipedia articles relating to medicine. Each article is modified by
> reducing it to a list of words and their counts in that article. Is there
> some advice on whether this sort of modification is allowable or whether it
> disqualifies us?
A language model derived from a corpus is not necessarily a derived
work of the corpus. Opinions vary. Some would tell you that it's a new
work entirely, and you own it. Others would tell you that you need a
specific license from the original content owners.

> Tim
> On 01/02/2013 11:28 AM, Jörn Kottmann wrote:
>> Hello,
>> it depends on which CC license the material is licensed under.
>> The legal FAQ clarifies it for some of them:
>> For Creative Commons Share Alike 2.5/3.0 it says:
>> "Unmodified media under the Creative Commons Attribution-Share Alike 2.5 and
>> Creative Commons Attribution-Share Alike 3.0 licenses may be included in
>> Apache products, subject to the licenses attribution clauses which may
>> require LICENSE/NOTICE/README changes. ...."
>> Is that the license Wikipedia is licensed under?
>> Jörn
>> On 01/02/2013 05:10 PM, Chen, Pei wrote:
>>> Hi,
>>> We would like to check in some derived features/models from Wikipedia
>>> into the src code base and would like to double check - are Creative Commons
>>> Licenses compatible with ASL 2.0?
>>> We couldn't find it in the approved 3rd party list:
>>> Thanks,
>>> Pei
>>> -----Original Message-----
>>> From: Tim Miller
>>> Sent: Monday, December 31, 2012 3:22 PM
>>> To:<>
>>> Subject: checking in wiki
>>> Hi team,
>>> I'm just about ready to check in the small Wikipedia index and the new
>>> coref features and models that take advantage of it, and I want to verify
>>> what changes we need to make to the license/notice to allow this in the next
>>> release.  The NOTICE section has the dependent software included -- is it
>>> sufficient to add something like this:
>>>      This product includes contents adapted from the English-language
>>>      Wikipedia (<>) developed under the Creative Commons
>>>      Attribution-ShareAlike 3.0 License
>>>      (
>>> Thanks
>>> Tim
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:<>
> For additional commands, e-mail:<>
