hadoop-general mailing list archives

From Arun C Murthy <...@yahoo-inc.com>
Subject Re: bringing the codebases back in line
Date Fri, 22 Oct 2010 00:30:22 GMT

On Oct 21, 2010, at 4:50 PM, Ian Holsman wrote:

> but the other question I have which hopefully you guys can answer is
> does the yahoo distribution have ALL the patches from the trunk on it?
> because if it doesn't I think that is problematic as well for other
> reasons.

Yahoo added security on top of Apache Hadoop 0.20.

Apache Hadoop trunk has diverged a long way from hadoop-0.20. There are
lots of features in trunk which aren't part of yahoo-hadoop-0.20,
simply because there wasn't a need or it wasn't worth our effort to
backport them. I know, since I have a big hand in deciding what gets
backported.

However, we have been very religious about porting all our changes to
trunk, though we might have missed a couple due to time pressure,
human error etc.

Thus, it isn't feasible for the Yahoo distribution to be a superset of
trunk, all the more so because it takes a *huge* amount of effort to
qualify trunk. We at Yahoo qualified Apache Hadoop 0.20 and have stuck
with it for over a year now, as have Cloudera, Facebook etc. Again,
I'll point out that we have been very good at porting nearly 4000
internal commits to trunk throughout this time.

Hope that helps.

