hadoop-mapreduce-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-279) Map-Reduce 2.0
Date Thu, 17 Mar 2011 01:22:32 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-279:

    Attachment: MR-279_MR_files_to_move.txt

Folks, we are happy to put out a first cut of MRv2.

A brief overview:

A global ResourceManager (RM) tracks machine availability and scheduling invariants while
a per-application ApplicationMaster (AM) runs inside the cluster and tracks the program semantics
for a given job. An application can be a single MapReduce job, as the JobTracker supports
today, a directed acyclic graph (DAG) of MapReduce jobs, or a new
framework. Each machine in the cluster runs a per-node daemon, the NodeManager (NM), responsible
for enforcing and reporting the resource allocations made by the RM and monitoring the lifecycle
of processes spawned on behalf of an application. Each process started by the NM is conceptually
a container, or a bundle of resources allocated by the RM.
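
To make the division of labour above concrete, here is a toy sketch in Python. It is not the MRv2 code or its API: every class, method and parameter name below is hypothetical, resources are reduced to memory in MB, and the first-fit placement is only an assumption made for illustration. It shows a global RM handing out containers, per-node NMs enforcing those allocations, and a per-application AM asking for one container per task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """A bundle of resources granted by the RM on one node (hypothetical shape)."""
    container_id: int
    node: str
    memory_mb: int


class NodeManager:
    """Per-node daemon: enforces and reports the allocations made by the RM."""

    def __init__(self, name, capacity_mb):
        self.name = name
        self.capacity_mb = capacity_mb
        self.running = {}  # container_id -> Container

    def free_mb(self):
        used = sum(c.memory_mb for c in self.running.values())
        return self.capacity_mb - used

    def launch(self, container):
        # Enforce the allocation: refuse anything this node cannot hold.
        if container.memory_mb > self.free_mb():
            raise RuntimeError("allocation exceeds node capacity")
        self.running[container.container_id] = container

    def complete(self, container_id):
        # The process finished; its resources become available again.
        self.running.pop(container_id, None)


class ResourceManager:
    """Global view of machine availability; knows nothing about job semantics."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self._next_id = 0

    def allocate(self, memory_mb):
        # First-fit placement (an assumption for this sketch only).
        for node in self.nodes.values():
            if node.free_mb() >= memory_mb:
                self._next_id += 1
                container = Container(self._next_id, node.name, memory_mb)
                node.launch(container)
                return container
        return None  # nothing fits right now


class ApplicationMaster:
    """Per-application component: tracks the tasks of one job."""

    def __init__(self, rm, tasks, memory_mb):
        self.rm = rm
        self.tasks = list(tasks)
        self.memory_mb = memory_mb

    def request_containers(self):
        placed, pending = {}, []
        for task in self.tasks:
            container = self.rm.allocate(self.memory_mb)
            if container is None:
                pending.append(task)  # would be retried once resources free up
            else:
                placed[task] = container
        return placed, pending


nodes = [NodeManager("node-1", 2048), NodeManager("node-2", 2048)]
rm = ResourceManager(nodes)
am = ApplicationMaster(rm, [f"map-{i}" for i in range(5)], memory_mb=1024)
placed, pending = am.request_containers()
for task, container in placed.items():
    print(task, "->", container.node)
print("pending:", pending)
```

With two 2048 MB nodes and five 1024 MB tasks, four tasks are placed and one stays pending until an NM reports a completed container. The point of the split is that the RM never needs to know what a "map" is; only the AM does.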

We call the new framework (RM/NM) YARN (Yet Another Resource Negotiator)... ;-)

Source layout:

# A new yarn source folder contains the RM and NM.
# A new mr-client folder contains all of the MapReduce runtime. This includes the MapReduce
ApplicationMaster and all of the classes for running MapReduce applications. Please note that
the MR runtime has not changed at all, including the user APIs - we continue to support both
the old 'mapred' API and the new 'mapreduce' API (context-objects). We are moving some classes
from src/java/mapred/* to mr-client to make this possible.
# We have continued to keep the old JobTracker/TaskTracker based MapReduce framework in src/java.

# We decided to embrace maven for MRv2, hence yarn and mr-client are built via maven.
# For now the old JT/TT based MR framework continues to use ant/ivy. Hopefully we can change
this soon - I know Giri is working on this for common, hdfs and mapreduce in one go.

There is an INSTALL file which describes how to build and deploy MRv2, and how to run MR applications.


I'm planning on committing this patch to a development branch (named MAPREDUCE-279) soon so
that we can continue all our work via Apache in the open. We *really* look forward to feedback
and working with the community henceforth. We have many many miles to go and promises to keep!

PS: I have attached a script (MR-279.sh) to show the files being moved to mr-client for
the MR runtime, a list of files being moved and the actual patch to apply after. Also, please
note that the patch is significantly bigger than it should be since it includes binary images
(via git diff --text).

> Map-Reduce 2.0
> --------------
>                 Key: MAPREDUCE-279
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, tasktracker
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.23.0
>         Attachments: MR-279.patch, MR-279.sh, MR-279_MR_files_to_move.txt
> Re-factor MapReduce into a generic resource scheduler and a per-job, user-defined component
> that manages the application execution.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
