www-legal-discuss mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Data Donation
Date Sat, 15 Nov 2008 12:24:45 GMT
FYI, the vendor has agreed to give us a time-limited access code and  
to allow us to access the data on their host.  Thus, people can  
download if they wish and the only thing we are providing is the  
knowledge that the company has made said data available publicly for a  
period of time.  Thus, the vendor can update the data as appropriate  
if anyone expresses concerns about some of the content and individuals  
are free to download or not and the ASF isn't liable.  It's pretty  
much the same as with any pointers to data we give out.


On Nov 10, 2008, at 3:26 PM, Lawrence Rosen wrote:

> Someone (or many) must own those blogs. Libraries obtain copyright  
> clearance
> from their vendors and book publishers, but you say you know that  
> this donor
> doesn't own the copyrights or have licenses. That troubles me. Would  
> this
> donor be able to sign a CLA?
> If ASF becomes a publisher and distributor of others' data, we must  
> offer
> assurances to our readers that we're not infringing copyrights or
> inducing/contributing to infringement by others. If ASF were to  
> distribute
> any data, I would suggest we obtain copyright clearance to do so  
> from its
> authors.
> Doing our own crawl for the data is entirely different, although  
> when we
> offer it we still must be careful to limit ourselves (as Google and  
> Yahoo
> do) to appropriate copying and/or linking. For example, that is one  
> reason
> why Google News links to the original news website rather than  
> providing you
> with a copy of each news article, although clearly they have made a  
> copy for
> indexing purposes.
> Is this donor offering blogs or links to blogs?
> /Larry
>> -----Original Message-----
>> From: Grant Ingersoll [mailto:gsingers@apache.org]
>> Sent: Monday, November 10, 2008 11:54 AM
>> To: legal-discuss@apache.org
>> Subject: Re: Data Donation
>> On Nov 10, 2008, at 12:16 PM, Lawrence Rosen wrote:
>>> Grant Ingersoll asked:
>>>> At any rate, what do others think?  Would it be an issue if the ASF
>>>> distributed this data?
>>> If ASF were to distribute certain blog data ourselves, we would have
>>> to take
>>> reasonable steps to honor whatever copyrights exist in that data.  
>>> Do I
>>> understand correctly that this is an option that we're *not*
>>> interested in
>>> pursuing?
>> Well, we could go do a crawl ourselves using something like Nutch
>> that honors robots.txt, but that takes up a lot of bandwidth, time  
>> and
>> effort, so, I personally am not interested in doing it when someone
>> else has and is willing to donate it to us.
>>> We shouldn't be concerned about reading anything at all that someone
>>> else
>>> makes accessible in a public way without restriction. That does not
>>> involve
>>> our copying and distribution, except peripherally through our
>>> individual
>>> browsers. That is like reading a book in a library; we don't concern
>>> ourselves whether the library itself has honored copyrights.
>>> If I'm misunderstanding this offer, please explain.
>> I'm not concerned about reading, I'm concerned about writing/copying.
>> In other words, are we at risk if we make a copy of their donated  
>> data
>> (which they do not have copyright for) and we then redistribute  
>> it?  I
>> think your library argument says we are not at risk and I tend to
>> agree.  However, this seems slightly different, since, in the library
>> case, they are not making a copy but are instead providing access  
>> to a
>> copy.  Whereas we are making a copy, it seems, and providing an easy
>> means for others to do so as well.  Of course, I suppose by that
>> logic, one could argue that the ISPs that host the blogs are making
>> copies too, and that just seems like a slippery slope.
>> Just to be clear, I would very much like to distribute this data via
>> the ASF and want to know if there is any obvious reason not to  
>> proceed
>> or if there is anything in particular I should do before moving  
>> forward.
>>> /Larry
>>>> -----Original Message-----
>>>> From: Grant Ingersoll [mailto:gsingers@apache.org]
>>>> Sent: Monday, November 10, 2008 7:46 AM
>>>> To: legal-discuss@apache.org
>>>> Subject: Data Donation
>>>> Hi,
>>>> I have a contact from a company that distributes blog data who is
>>>> willing to donate/provide access to the data without restrictions.
>>>> It's about 50 GBs and is of limited commercial value to them, since
>>>> it
>>>> is older data.  It is, however, useful to us in Lucene.  The main
>>>> concern
>>>> I have at this point, that I can't quite get my head around, is the notion
>>>> of
>>>> copyright on the data.  There are two scenarios, I think:
>>>> 1. They host the data and we merely link to it.
>>>> 2. We host the data and make it available to all.
>>>> In case #1, I don't think there is really an issue w/ copyright,
>>>> since
>>>> people are downloading it themselves, we are just providing a link.
>>>> In case #2, it seems a little fuzzier.  The company explicitly  
>>>> tells
>>>> their customers that it is up to the person downloading to respect
>>>> the
>>>> copyright laws of their jurisdiction.  In other words, they are
>>>> merely
>>>> facilitating access.  I think the case is similar to the one that
>>>> Google makes in terms of their caching of webpages.   I don't  
>>>> know if
>>>> this argument is just putting their head in the sand or not.  I
>>>> suppose it would come down to whether or not a blogger would sue  
>>>> over
>>>> inclusion of their site in the collection.
>>>> As of now, let's assume the company honors robots.txt when  
>>>> crawling,
>>>> if that matters at all in your responses.  I don't know if they  
>>>> do or
>>>> not, but I would guess they do.
>>>> At any rate, what do others think?  Would it be an issue if the ASF
>>>> distributed this data?
>>>> -Grant

DISCLAIMER: Discussions on this list are informational and educational
only.  Statements made on this list are not privileged, do not
constitute legal advice, and do not necessarily reflect the opinions
and policies of the ASF.  See <http://www.apache.org/licenses/> for
official ASF policies and documents.
To unsubscribe, e-mail: legal-discuss-unsubscribe@apache.org
For additional commands, e-mail: legal-discuss-help@apache.org
