hadoop-general mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: [VOTE] Shall we adopt the "Defining Hadoop" page
Date Thu, 16 Jun 2011 01:02:53 GMT
On Wed, Jun 15, 2011 at 10:44 AM, Matthew Foley <mattf@yahoo-inc.com> wrote:
> Eli, you said:
>> Putting a build of Hadoop that has 4 security patches applied into the same
>> category as a product that has entirely re-worked the code and not
>> gotten it checked into trunk does a major disservice to the people who
>> contribute to and invest in the project.
>
> How would you phrase the distinction, so that it is clear and reasonably unambiguous
> for people who are not Hadoop developers?  Do the HTTPD and Subversion policies
> draw this distinction, and if so could you please point us at the specific text, or
> copy that text to this thread?
>

I'll try to find it; this was told to me verbally a while back. Maybe
Roy can chime in.

Since there seems to be some confusion around distribution, we should
make this explicit.  Some people are currently interpreting the
guidelines to say that if you patch an Apache Hadoop release yourself
then you're still running Apache Hadoop, but if a vendor patches
Apache Hadoop for you then you're not running Apache Hadoop. What
if a subcontractor patches Apache Hadoop for you? Is it still Apache
Hadoop? This isn't sustainable.

Thanks,
Eli
> Thanks,
> --Matt
>
>
> On Jun 15, 2011, at 9:40 AM, Eli Collins wrote:
>
> On Tue, Jun 14, 2011 at 7:45 PM, Owen O'Malley <omalley@apache.org> wrote:
>>
>> On Jun 14, 2011, at 5:48 PM, Eli Collins wrote:
>>
>>> Wrt derivative works, it's not clear from the document, but I think we
>>> should explicitly adopt the policy of HTTPD and Subversion that
>>> backported patches from trunk and security fixes are permitted.
>>
>> Actually, the document is extremely clear that only Apache releases may be called
>> Hadoop.
>>
>> There was a very long thread about why the rapidly expanding Hadoop-ecosystem is
>> leading to a lot of customer confusion about the different "versions" of Hadoop. We as the
>> Hadoop project don't have the resources or the necessary compatibility test suite to test
>> compatibility between the different sets of cherry-picked patches. We also don't have time
>> to ensure that all of the 1,000s of patches applied to 0.20.2 in each of the many (10? 15?)
>> different versions have been committed to trunk. Furthermore, under the Apache license, a company
>> Foo could claim that it is a cherry-pick version of Hadoop without releasing their source
>> code that would enable verification.
>>
>> In summary,
>>  1. Hadoop is very successful.
>>  2. There are many different commercial products that are trying to use the Hadoop
>>     name.
>>  3. We can't check or enforce that the cherry pick versions are following the rules.
>>  4. We don't have a TCK like Java does to validate new versions are compatible.
>>  5. By far the most fair way to ensure compatibility and fairness between companies
>>     is that only Apache Hadoop releases may be called Hadoop.
>>
>> That said, a package that includes a small number (< 3) of security patches that
>> haven't been released yet doesn't seem unreasonable.
>>
>
> I've spoken with ops teams at many companies, and I am not aware of
> anyone who runs an official release (with just 2 security patches). By
> this definition many of the most valuable contributors to Hadoop,
> including Yahoo!, Cloudera, Facebook, etc. are not using Hadoop.  Is
> that really the message we want to send? Do we expect the PMC to enforce
> this equally across all parties?
>
> It's a fact of life that companies and ops teams that support Hadoop
> need to patch the software before the PMC has time and/or will to vote
> on new releases. This is why HTTPD and Subversion allow this. Putting a
> build of Hadoop that has 4 security patches applied into the same
> category as a product that has entirely re-worked the code and not
> gotten it checked into trunk does a major disservice to the people who
> contribute to and invest in the project.
>
> Thanks,
> Eli
>
>
