hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Scott Carey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
Date Tue, 22 Feb 2011 05:49:38 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997648#comment-12997648

Scott Carey commented on MAPREDUCE-279:

Consider a 10k node cluster with 25-30 containers per node and 10k running jobs - we'd need
at least 10k * 10k watches which is a lot for ZK

Thanks for the info Arun.  There would be a lot to work out to mix in ZK and not run into
a scalability wall.  
If you assume that each node has to watch every job, its not going to scale.   If each node
is only watching one thing when in need of work ("Is there work for me?") you can get a large
chunk of the RPC that causes delayed task starts gone.  I'm mainly thinking of the "is there
work for me now?  what about now?  And now?" RPC that goes on in hadoop today.  That could
be inverted into "flag three nodes with local data simultaneously that there is work for them,
the first to grab the item wins".  How valuable is replacing just part of the RPC?  I'm not
sure.  It would help my clusters, but they aren't that big.
The other part of the scheduling problem you allude to that requires scanning all available
jobs and assigning resources would need some clever work to do in ZK without scalability problems.

On a related item, I am glad that job submission includes a DAG of tasks.  There is a lot
of opportunity to reduce latency in job flows there and consolidate work from a half-dozen
projects duplicating effort.

It is becoming much harder, not easier to evolve the hadoop code base.
The choice to have all three projects be in their own trunk/tags/branches was a mistake IMO.
 I've done the same elsewhere and learned the hard way:  don't put projects under different
version trees unless you intend to actually completely decouple them *and* release them separately.

Hadoop needs more modularity and plugability, but making Cluster Management and Application
Management plug-able does not depend on separate projects, its the other way around.  Hadoop
needs to become more modular internally, its build more sophisticated, and the build outputs
more flexible.  After a user can swap out foo-resource-manager.jar with hadoop-resource-manager.jar
behind resource-manager-api.jar and expect it to work, a separate project for the hadoop-resource-manager
could make sense.

That said, I agree with Eric's #1 -- future modularity and this work are separate discussions
/ items.  IMO any greater project restructuring related to cluster management depends on this,
and not the other way around.  A project split should not be the enforcer of for modularity,
actual proven modularity should be justification for a split.  If one is afraid that without
a project split, things are bound to be intertwined, other solutions should be found.  Releasing
separate jars for the components is one way to move forward that does not need a project split
-- though it might require Maven to make it easy to manage and make a split much easier.

> Map-Reduce 2.0
> --------------
>                 Key: MAPREDUCE-279
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, tasktracker
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.23.0
> Re-factor MapReduce into a generic resource scheduler and a per-job, user-defined component
that manages the application execution. 

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message