hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
Date Wed, 17 Oct 2012 19:08:06 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478253#comment-13478253 ]

Robert Joseph Evans commented on MAPREDUCE-4495:


I think there is a bit of a disconnect about which version of the design some of my questions
refer to.  Sorry that I did not specify it on all of them.  Most of my questions are about
what will happen in the V2/V3 time frame, when child AMs will be running as separate processes
in containers that are launched directly by the WF AM, not as containers that are launched
and monitored by the RM.  I get that in V1 the WF AM will simply launch other applications
and interact with them through the client API.  But for V2 and V3, when an AM is launched by
the WF AM and not directly by the RM, the WF AM must take over some responsibilities of the
RM.  I am curious how many of those responsibilities it will take over.  I am also curious
about what modifications will be required to other AMs so that they can interact with both
the WF AM and the RM directly.  Am I wrong about this?  Is the design different from how I
understood it, so that the WF AM will not launch other AMs as separate processes running in
different containers?
> Workflow Application Master in YARN
> -----------------------------------
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>         Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, MapReduceWorkflowAM.pdf,
> It is useful to have a workflow application master, which will be capable of running
> a DAG of jobs. The workflow client submits a DAG request to the AM and then the AM will manage
> the life cycle of this application in terms of requesting the needed resources from the RM,
> and starting, monitoring and retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master, these are some
> of the advantages:
>  - Fewer consumed resources, since only one application master will be spawned
>    for the whole workflow.
>  - Reuse of resources, since the same resources can be used by multiple consecutive jobs
>    in the workflow (no need to request/wait for resources for every individual job from the central
>  - More optimization opportunities in terms of collective resource requests.
>  - Optimization opportunities in terms of rewriting and composing jobs in the workflow
>    (e.g. pushing down Mappers).
>  - This Application Master can be reused/extended by higher-level systems like Pig and Hive
>    to provide an optimized way of running their workflows.

