Subject: Re: [jira] [Created] (LEGAL-90) What are the licensing implications for statistical information drawn from non-ASL2-licensed data, e.g. word frequency lists from Wikipedia dumps?
From: Ralph Goers
Date: Wed, 18 May 2011 23:38:58 -0700
To: legal-discuss@apache.org
Message-Id: <70DC9CB1-788B-4DC4-90BD-FC8F51217777@dslextreme.com>
In-Reply-To: <095301cc15b2$b503f840$1f0be8c0$@com>
References: <896172374.24047.1305754127586.JavaMail.tomcat@hel.zones.apache.org> <095301cc15b2$b503f840$1f0be8c0$@com>

I'm not sure if you are aware that you are probably not answering the author in a manner that is visible to him. He asked his question in Jira - which automatically sends an email here. He may not be subscribed to this list and your answer won't automatically be forwarded to Jira.

Ralph

On May 18, 2011, at 4:24 PM, Lawrence Rosen wrote:

> Steven Rowe asked:
>> What are the licensing implications for statistical information drawn
>> from non-ASL2-licensed data, e.g. word frequency lists from Wikipedia
>> dumps?
> and
>> I'm also interested in the more general question, as posed in the issue
>> summary: do the licenses covering arbitrary data, text or otherwise,
>> have any bearing on statistical products created over the data?
>
> Interesting questions.
>
> Perhaps you could argue the fair use factors in 17 USC 107 to conclude that your transformations of those copyrighted works are fair use for scholarship or research purposes? For example, building a word index and word count for Shakespeare's plays used to be an important way to analyze whether the same person wrote all the works. Of course Shakespeare is public domain nowadays, so the example isn't precisely on point.
>
> These are the fair use factors:
>
> (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
>
> (2) the nature of the copyrighted work;
>
> (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
>
> (4) the effect of the use upon the potential market for or value of the copyrighted work.
>
> You might also argue that a statistical transformation of a work doesn't create a copyrightable work, hence it is not even a derivative work. I'm not sure what it is.... Perhaps just a set of numbers that means something only to a statistician? Is the reduced data an "expressive work"?
>
> /Larry
>
>
>> -----Original Message-----
>> From: Steven Rowe (JIRA) [mailto:jira@apache.org]
>> Sent: Wednesday, May 18, 2011 2:29 PM
>> To: legal-discuss@apache.org
>> Subject: [jira] [Created] (LEGAL-90) What are the licensing
>> implications for statistical information drawn from non-ASL2-licensed
>> data, e.g. word frequency lists from Wikipedia dumps?
>>
>> What are the licensing implications for statistical information drawn
>> from non-ASL2-licensed data, e.g. word frequency lists from Wikipedia
>> dumps?
>> -----------------------------------------------------------------------
>>
>> Key: LEGAL-90
>> URL: https://issues.apache.org/jira/browse/LEGAL-90
>> Project: Legal Discuss
>> Issue Type: Question
>> Reporter: Steven Rowe
>>
>>
>> I have generated word frequency lists from full Wikipedia dumps in
>> several languages. For the purposes of inclusion in ASL2-licensed
>> products, do I need to care about the license(s) covering the original
>> text?
>>
>> My interpretation (IANAL) of the [Creative Commons Attribution-ShareAlike
>> 3.0 Unported license|http://creativecommons.org/licenses/by-sa/3.0/legalcode],
>> under which [Wikipedia text is
>> licensed|http://wikimediafoundation.org/wiki/Terms_of_Use], is that the
>> license applies only to the Covered Works, Adaptations, and
>> Collections, and that a word frequency list qualifies as none of these:
>> Adaptations are "recognizably derived from the original"; and
>> in Collections "the Work is included in its entirety in unmodified form
>> along with one or more other contributions".
>>
>> My interpretation of the answer to the resolved question ["Can Apache
>> projects include Creative Commons Attribution-Share Alike
>> works?"|http://www.apache.org/legal/resolved.html#cc-sa] is that even
>> if the CC-SA license applies to my word frequency lists, I can still
>> include them in an ASL2-licensed product, as long as attribution is
>> provided.
>>
>> I'm also interested in the more general question, as posed in the issue
>> summary: do the licenses covering arbitrary data, text or otherwise,
>> have any bearing on statistical products created over the data?
>>
>> --
>> This message is automatically generated by JIRA.
>> For more information on JIRA, see:
>> http://www.atlassian.com/software/jira
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: legal-discuss-unsubscribe@apache.org
>> For additional commands, e-mail: legal-discuss-help@apache.org