hadoop-general mailing list archives

From Tom White <...@cloudera.com>
Subject Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset
Date Wed, 25 Aug 2010 17:16:15 GMT
Hi Arun,

I think it would be good to have a shared 0.20 Apache security branch.
Since security isn't in 0.21, and the 0.22 release is some way off
as you mention, this would be useful for folks who want the security
features sooner (and want to use an Apache release).


On Mon, Aug 23, 2010 at 5:27 PM, Arun C Murthy <acm@yahoo-inc.com> wrote:
> Even with the work on hadoop-0.22 (trunk) starting in earnest it is fairly
> obvious, given our past history, that it will take a while for us to get it
> stable and deployable - for e.g. it took us nearly 6 months to deploy
> hadoop-0.20.
> In the interim I'd like to propose we push a hadoop-0.20-security release
> off the Yahoo! patchset (http://github.com/yahoo/hadoop-common). This will
> ensure the community benefits from all the work done at Yahoo! for over 12
> months *now*, and ensures that we do not have to wait until hadoop-0.22
> which has all of these patches.
> Some salient aspects:
> a) Full-fledged security implementation deployed at scale (4000 nodes) in
> production.
> b) Lots of work on stabilizing and optimizing the NameNode and
> JobTracker for over 12 months. This has been critical in deploying Hadoop at
> scale i.e. clusters of 4000 nodes. For e.g. we have a 50% improvement in CPU
> utilization on the JobTracker vis-a-vis the hadoop-0.20.2 release.
> c) Several new features in the scheduler (CapacityScheduler), Map-Reduce
> framework, better support for multi-tenancy etc.
> d) Several performance and stability improvements to the system e.g.
> iterative ls, robustness against rogue clients/jobs/users etc.
> Also, given the huge number of features and enhancements I'd like to propose
> we create a new 0.20-security branch and commit the Yahoo patchset there for
> the release.
> This has been proposed earlier by Doug and did not get far due to concerns
> about the effect this would have on development on trunk. However, I
> believe, we have a case for demonstrable progress on trunk now, and it would
> be useful to have an interim, fully-tested Apache Hadoop release available
> to the community.
> Conceivably, one could imagine a Hadoop Security + Append release soon
> after. At this point a Hadoop Security release alone would add tremendous
> value for the reasons above. Presently we would like to get this release out
> quickly to focus the majority of our efforts on trunk.
> Thoughts?
> Arun
