hadoop-general mailing list archives

From Matthew Foley <ma...@yahoo-inc.com>
Subject Re: [VOTE] Shall we adopt the "Defining Hadoop" page
Date Thu, 16 Jun 2011 17:38:48 GMT
After writing my note to Eric, I realize that Eli and I are guilty of the same attempt
to use legal terminology in an engineering context.  Craig Russell is absolutely right.
If you change one bit, it is a "derived work".

However, we can still allow the trademark to be applied to that work, if it 
meets licensing criteria.  So what we are arguing about is, "Where is the boundary
line between something we are willing to call 'Apache Hadoop' and something
that must be called 'Product XYZ Powered by Apache Hadoop'?"

I'm in favor of a very strict definition.  It needs to be really, really close to a
PMC-approved release.  But I'm open to the argument that a small number
of security patches could be necessary for a viable commercial product,
and that shouldn't necessarily prevent it from using the trademark.

But I suggest we stop focusing on the term "derived work".  Note that the 
"Defining Apache Hadoop" draft document we are voting on doesn't use 
that term.


On Jun 16, 2011, at 9:05 AM, Eli Collins wrote:

On Wed, Jun 15, 2011 at 6:17 PM, Matthew Foley <mattf@yahoo-inc.com> wrote:
> I tend to agree with what I think you are saying, that
>        * applying a small-number-of-patches that are
>        * for high-severity-bug-fixes, and
>        * have been Apache-Hadoop-committed
> to an Apache Hadoop release should not demote the result to a "derived work".
> However, if so many patches are applied that the result cannot be meaningfully
> correlated with a specific Apache Hadoop release, then it probably has
> become a derived work.

This is one reason why I think the definition of derived work in the
draft of the wiki is way too broad. Something that's nothing like
Hadoop at all but includes a Hadoop jar is given the same label as
something with a single security patch. I think we can come up with a
more useful definition of derived work. If we do, that would help us
draw the distinction between:
1. An Apache Hadoop release voted on by the PMC, bit-for-bit identical
2. An Apache Hadoop release + backports (e.g., per the above
definition of backport)
3. Something that is powered by Hadoop (e.g., HBase)
4. Something that is neither Hadoop nor powered by Hadoop (e.g., the way tc
Server is not powered by Apache Tomcat)

Note that the current document does not make an exception for security
patches. Owen and I made this suggestion on this thread, but the
writeup we are voting on makes no such exception.

> But how do we draw a meaningful line across that big gray area?  That's why I'd like
> to see specific text from one of the other projects you cited as an example.

Googling didn't turn up anything in their public archives. This was in
an email exchange I had with Shane several years ago. Hopefully their
PMC can chime in.

