hadoop-general mailing list archives

From Arun C Murthy <...@yahoo-inc.com>
Subject Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset
Date Fri, 14 Jan 2011 04:23:03 GMT

On Jan 13, 2011, at 6:50 PM, Eli Collins wrote:

> The cdh3 patch set Todd is talking about is not vanilla 104.3, it's
> 104.3 re-based onto 20.2 plus patches from branch-20 and trunk (the
> performance and stability fixes I think you're referring to, at least
> the ones that have been posted to Apache jira).
> Can you post a pointer to the version you're referring to, eg on
> github?  If there isn't a big delta between it and the cdh3 patch set
> (which should have the 20-based patches from jira) perhaps you and
> Todd could easily merge in the delta to create 0.20.x?

I can guarantee it will need work to merge the enhancements since
20.104.3; it's over 6 months of development. The enhancements include
work on stability such as iterative ls, limits on the JT to prevent single
jobs/users from taking it down, etc., and lots of bug-fixes to security.
So, unfortunately, the delta is pretty large.

I'm working on a CHANGES.txt which should reflect all the changes, i.e.
bug-fixes and enhancements.

>> The version I'm offering to push to the community has fixed all of them,
>> *plus* the added benefit of several stability and performance fixes we have
>> done since 20.104.3, almost 10 internal releases. This is a battle tested
>> and hardened version which we have deployed on 40,000+ nodes. It is a
>> significant upgrade on which we never deployed. I'm pretty sure
>> *some* users will find that valuable. ;)
> Definitely, but better to hit two birds with one stone right?  Instead
> of a security + enhancements release and an append release we could
> have a single security + append + enhancements release and users don't
> have to choose.

We are discussing two options:
20 + security + enhancements
20 + security + append

I think the value we provide via a 20+security+enhancements release is
that it's stable, tested and deployed at scale. Doing any more work
merging 6 months of work at Yahoo (again, I guarantee it's a lot of
work) will need a lot of cycles to validate, test and stabilize.

I feel the alternative is a distraction for me; I'd rather work on 0.22.

I can get 20+security+enhancements done very, very quickly precisely
because I don't have to spend cycles testing it.

Does that make sense? Thanks for being patient and bearing with me...

