www-legal-discuss mailing list archives

From "Lawrence Rosen" <lro...@rosenlaw.com>
Subject RE: Data Donation
Date Mon, 10 Nov 2008 20:26:08 GMT
Someone (or many) must own those blogs. Libraries obtain copyright clearance
from their vendors and book publishers, but you say you know that this donor
doesn't own the copyrights or have licenses. That troubles me. Would this
donor be able to sign a CLA?

If ASF becomes a publisher and distributor of others' data, we must offer
assurances to our readers that we're not infringing copyrights or
inducing/contributing to infringement by others. If ASF were to distribute
any data, I would suggest we obtain copyright clearance to do so from its
authors.

Doing our own crawl for the data is entirely different, although when we
offer it we still must be careful to limit ourselves (as Google and Yahoo
do) to appropriate copying and/or linking. For example, that is one reason
why Google News links to the original news website rather than providing you
with a copy of each news article, although clearly they have made a copy for
indexing purposes. 

Is this donor offering blogs or links to blogs?

/Larry



> -----Original Message-----
> From: Grant Ingersoll [mailto:gsingers@apache.org]
> Sent: Monday, November 10, 2008 11:54 AM
> To: legal-discuss@apache.org
> Subject: Re: Data Donation
> 
> 
> On Nov 10, 2008, at 12:16 PM, Lawrence Rosen wrote:
> 
> > Grant Ingersoll asked:
> >> At any rate, what do others think?  Would it be an issue if the ASF
> >> distributed this data?
> >
> > If ASF were to distribute certain blog data ourselves, we would have
> > to take reasonable steps to honor whatever copyrights exist in that
> > data. Do I understand correctly that this is an option that we're
> > *not* interested in pursuing?
> 
> Well, we could go do a crawl ourselves using something like Nutch
> that honors robots.txt, but that takes up a lot of bandwidth, time and
> effort, so I personally am not interested in doing it when someone
> else has and is willing to donate it to us.
> 
> >
> >
> > We shouldn't be concerned about reading anything at all that someone
> > else makes accessible in a public way without restriction. That does
> > not involve our copying and distribution, except peripherally through
> > our individual browsers. That is like reading a book in a library; we
> > don't concern ourselves whether the library itself has honored
> > copyrights.
> >
> > If I'm misunderstanding this offer, please explain.
> 
> I'm not concerned about reading, I'm concerned about writing/copying.
> In other words, are we at risk if we make a copy of their donated data
> (which they do not have copyright for) and we then redistribute it?  I
> think your library argument says we are not at risk and I tend to
> agree.  However, this seems slightly different, since, in the library
> case, they are not making a copy but are instead providing access to a
> copy.  Whereas we are making a copy, it seems, and providing an easy
> means for others to do so as well.  Of course, I suppose by that
> logic, one could argue that the ISPs that host the blogs are making
> copies too, and that just seems like a slippery slope.
> 
> Just to be clear, I would very much like to distribute this data via
> the ASF and want to know if there is any obvious reason not to proceed
> or if there is anything in particular I should do before moving forward.
> 
> 
> >
> >
> > /Larry
> >
> >
> >
> >> -----Original Message-----
> >> From: Grant Ingersoll [mailto:gsingers@apache.org]
> >> Sent: Monday, November 10, 2008 7:46 AM
> >> To: legal-discuss@apache.org
> >> Subject: Data Donation
> >>
> >> Hi,
> >>
> >> I have a contact from a company that distributes blog data who is
> >> willing to donate/provide access to the data without restrictions.
> >> It's about 50 GB and is of limited commercial value to them, since
> >> it is older data.  It is, however, useful to us in Lucene.  The main
> >> concern I have at this point, which I can't quite get my head around,
> >> is the notion of copyright on the data.  There are two scenarios, I
> >> think:
> >>
> >> 1. They host the data and we merely link to it.
> >>
> >> 2. We host the data and make it available to all.
> >>
> >> In case #1, I don't think there is really an issue w/ copyright,
> >> since people are downloading it themselves; we are just providing a
> >> link.  In case #2, it seems a little fuzzier.  The company explicitly
> >> tells their customers that it is up to the person downloading to
> >> respect the copyright laws of their jurisdiction.  In other words,
> >> they are merely facilitating access.  I think the case is similar to
> >> the one that Google makes in terms of their caching of webpages.  I
> >> don't know if this argument is just putting their head in the sand or
> >> not.  I suppose it would come down to whether or not a blogger would
> >> sue over inclusion of their site in the collection.
> >>
> >> As of now, let's assume the company honors robots.txt when crawling,
> >> if that matters at all in your responses.  I don't know if they do or
> >> not, but I would guess they do.
> >>
> >> At any rate, what do others think?  Would it be an issue if the ASF
> >> distributed this data?
> >>
> >> -Grant
> >>
> 
> ---------------------------------------------------------------------
> DISCLAIMER: Discussions on this list are informational and educational
> only.  Statements made on this list are not privileged, do not
> constitute legal advice, and do not necessarily reflect the opinions
> and policies of the ASF.  See <http://www.apache.org/licenses/> for
> official ASF policies and documents.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: legal-discuss-unsubscribe@apache.org
> For additional commands, e-mail: legal-discuss-help@apache.org



