hadoop-general mailing list archives

From Ian Holsman <had...@holsman.net>
Subject Re: What is "Hadoop?" Was: Defining Hadoop Compatibility -revisiting-
Date Fri, 13 May 2011 23:17:11 GMT

On May 14, 2011, at 12:41 AM, Owen O'Malley wrote:

> On Tue, May 10, 2011 at 3:29 AM, Steve Loughran <stevel@apache.org> wrote:
>> I think we should revisit this issue before people with their own agendas
>> define what compatibility with Apache Hadoop is for us
> I agree completely. As you point out, this week we've had a flood of
> products calling themselves "Hadoop" or "Distribution of Hadoop" that
> include only a part of Hadoop. This will dilute Apache's Hadoop trademark
> and create consumer confusion.
>> Licensing
>> -Use of the Hadoop codebase must follow the Apache License
>> http://www.apache.org/licenses/LICENSE-2.0
>> -plug in components that are dynamically linked to (Filesystems and
>> schedulers) don't appear to be derivative works on my reading of this,
> +1
> Plugins are usually considered independent works. Note that the Apache
> license does permit commercial closed-source derivative works. A company
> could take Hadoop's code, modify it, and sell a binary release as long as
> they meet the conditions of the Apache license.
>> Naming
>> -this is something for branding@apache, they will have their opinions.
>> The key one is that the name "Apache Hadoop" must get used, and it's
>> important to make clear it is a derivative work.
>> -I don't think you can claim to have a Distribution/Fork/Version of Apache
>> Hadoop if you swap out big chunks of it for alternate filesystems, MR
>> engines, etc. Some description of this is needed
>> "Supports the Apache Hadoop MapReduce engine on top of Filesystem XYZ"
> The Hadoop name is the primary tool that the project has for minimizing
> customer confusion. I think we need to create a very clear definition of
> what can be called Hadoop and what can not. Apache gives the PMCs a fair
> amount of latitude in picking the policy for their project name and I think
> we need to do so.
> Given the large number of so-called Hadoop products that are being released,
> I believe that we should require "Hadoop" to mean specifically the Apache
> Hadoop releases (possibly with a few critical security patches).
> Projects that are derivative works can either be "powered by Apache Hadoop,"
> or "based on Apache Hadoop."
> What do others think?
I think that's a great idea.
Maybe we should also create names/marks around the interfaces as well.

> -- Owen
