hadoop-general mailing list archives

From Hemanth Yamijala <yhema...@gmail.com>
Subject Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset
Date Wed, 25 Aug 2010 17:46:43 GMT

How much time do you think it would take to have a version of 0.20
with the security features in it ready? In a different thread, Owen
has started discussing plans around 0.22. Do you think this effort
would affect the 0.22 release?

I do agree that this would be very useful for folks who want security
sooner. And the fact that Yahoo! has been running it at scale for a
good while now is also reassuring.


On Tue, Aug 24, 2010 at 5:57 AM, Arun C Murthy <acm@yahoo-inc.com> wrote:
> Even with the work on hadoop-0.22 (trunk) starting in earnest it is fairly
> obvious, given our past history, that it will take a while for us to get it
> stable and deployable - e.g. it took us nearly 6 months to deploy
> hadoop-0.20.
> In the interim I'd like to propose we push a hadoop-0.20-security release
> off the Yahoo! patchset (http://github.com/yahoo/hadoop-common). This will
> ensure the community benefits from all the work done at Yahoo! for over 12
> months *now*, and ensures that we do not have to wait until hadoop-0.22
> which has all of these patches.
> Some salient aspects:
> a) Full-fledged security implementation deployed at scale (4000 nodes) in
> production.
> b) Lots of work on stabilizing and optimizing the NameNode and
> JobTracker for over 12 months. This has been critical in deploying Hadoop at
> scale i.e. clusters of 4000 nodes. E.g. we have a 50% improvement in CPU
> utilization on the JobTracker vis-a-vis the hadoop-0.20.2 release.
> c) Several new features in the scheduler (CapacityScheduler), Map-Reduce
> framework, better support for multi-tenancy etc.
> d) Several performance and stability improvements to the system e.g.
> iterative ls, robustness against rogue clients/jobs/users etc.
> Also, given the huge number of features and enhancements I'd like to propose
> we create a new 0.20-security branch and commit the Yahoo patchset there for
> the release.
> This has been proposed earlier by Doug and did not get far due to concerns
> about the effect this would have on development on trunk. However, I
> believe, we have a case for demonstrable progress on trunk now, and it would
> be useful to have an interim, fully-tested Apache Hadoop release available
> to the community.
>  Conceivably, one could imagine a Hadoop Security + Append release soon
> after. At this point a Hadoop Security release alone would add tremendous
> value for the reasons above. Presently we would like to get this release out
> quickly to focus the majority of our efforts on trunk.
> Thoughts?
> Arun
