hadoop-general mailing list archives

From <Milind.Bhandar...@emc.com>
Subject Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Date Fri, 07 Oct 2011 16:23:41 GMT

>you can improve Hadoop to make it more agile; my defunct Hadoop
>lifecycle branch did a lot of that, but you have to have everyone else
>using Hadoop to be willing to let the changes go in -and those changes
>mustn't impose a cost or risk to the physical cluster model.

Until Hadoop 0.20, when Hadoop On Demand (HoD) was in widespread use,
quickly bringing up a MapReduce cluster, and tearing it down just as
quickly, was an explicit goal.

After that, the focus shifted to multi-tenancy for MR in Hadoop.

When HoD went away, I commented on one of the internal mailing lists
that it would make a comeback when VMs become first-class citizens of the
Hadoop world.

I have heard of several efforts from well-known vendors *wink* to make
this happen.

I have been looking closely at the defunct HoD code to see if it can still
be used, but with the new MRv2 architecture, it looks like that will
require major surgery. We can have the RM allocate containers, and should
be able to run a custom MR runtime there (essentially replacing Torque in
HoD with the RM).

Is this something you had in mind too?

- milind

Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and
do not necessarily represent the views of any organization, past or
present, the author might be affiliated with.)
