hadoop-mapreduce-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN
Date Wed, 17 Oct 2012 05:20:10 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477617#comment-13477617 ]

Alejandro Abdelnur commented on MAPREDUCE-4495:
-----------------------------------------------

Bobby, thanks for taking the time to go over the doc/code. Following are some answers/comments
to your feedback.

On *How will the WFAM handle itself crashing…*: an important part of the design/implementation
is that all WF state changes can be stored in a consistent way after every signal call to the WF.
This enables dumping the updated state to a file in the form of an edit log; this file can
be in HDFS. On restart the WFAM would read the files from HDFS and reconstruct its state (this
is not implemented yet, but it is quite similar to how we do things today in Oozie using the
DB).
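
To illustrate the edit-log idea, here is a minimal sketch, not the patch's actual code. The `WorkflowEditLog` class, the tab-separated record format, and the status strings are all assumptions made up for this example, and it writes to a local file through `java.nio` rather than to HDFS.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: workflow state persisted as an append-only edit log. */
class WorkflowEditLog {
    // A local file stands in for the HDFS file mentioned in the comment.
    private final Path logFile;

    WorkflowEditLog(Path logFile) {
        this.logFile = logFile;
    }

    /** Appends one state change; intended to be called after every signal to the workflow. */
    void append(String nodeName, String newStatus) throws IOException {
        String record = nodeName + "\t" + newStatus + "\n";
        Files.write(logFile, record.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Replays the log in order; the last record written for a node wins. */
    Map<String, String> replay() throws IOException {
        Map<String, String> state = new LinkedHashMap<>();
        if (!Files.exists(logFile)) {
            return state; // nothing logged yet: start from an empty state
        }
        for (String line : Files.readAllLines(logFile, StandardCharsets.UTF_8)) {
            String[] parts = line.split("\t", 2);
            if (parts.length < 2) {
                continue; // skip empty or malformed records
            }
            state.put(parts[0], parts[1]);
        }
        return state;
    }
}
```

On restart, a fresh `WorkflowEditLog` pointed at the same file would call `replay()` to rebuild the per-node status map before resuming the workflow.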

On *If the WFAM does restart after a crash will it try to reestablish communication with App
Masters*, the WFAM would be able to reconnect with its child AMs without any issue: they
would have kept working without knowing that the parent WFAM was restarted, and the WFAM
would just use their async client APIs.

On *How will the WFAM schedule containers*, the original idea is to do a passthrough to the
RM; later this proxy may become more sophisticated and have a heuristic to reuse containers
when it makes sense. This would be possible when the WFAM is using embedded AMs (i.e. an MRAM
to run an MR job) and the embedded AMs support injection of Container implementations (i.e.
to replace the default container allocation).
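
A rough sketch of what such a proxy could look like follows. Everything in it is hypothetical: `ContainerAllocator`, `FakeResourceManager`, and `ReusingAllocator` are stand-ins invented for this example and are not YARN APIs, and the only reuse heuristic shown is "same memory size".

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical allocation interface that an embedded AM could have injected. */
interface ContainerAllocator {
    String allocate(int memoryMb);

    void release(String containerId, int memoryMb);
}

/** Stand-in for the RM: hands out a new container id on every request. */
class FakeResourceManager implements ContainerAllocator {
    private int next = 0;

    @Override
    public String allocate(int memoryMb) {
        return "container_" + (next++);
    }

    @Override
    public void release(String containerId, int memoryMb) {
        // A real RM would reclaim the resources here.
    }
}

/** Proxy in front of the RM: reuses idle containers of the same size, else passes through. */
class ReusingAllocator implements ContainerAllocator {
    private final ContainerAllocator rm;
    private final Map<Integer, Deque<String>> idle = new HashMap<>();

    ReusingAllocator(ContainerAllocator rm) {
        this.rm = rm;
    }

    @Override
    public String allocate(int memoryMb) {
        Deque<String> pool = idle.get(memoryMb);
        if (pool != null && !pool.isEmpty()) {
            return pool.pop(); // reuse a container released by an earlier job
        }
        return rm.allocate(memoryMb); // plain passthrough to the RM
    }

    @Override
    public void release(String containerId, int memoryMb) {
        // Keep the container for a later job instead of returning it to the RM.
        idle.computeIfAbsent(memoryMb, k -> new ArrayDeque<>()).push(containerId);
    }
}
```

With an empty idle pool the proxy behaves exactly like the passthrough described above, so the heuristic can be added later without changing the embedded AMs.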

On *How do you decided which AM etc has a higher priority…*, we are constrained by a DAG,
thus current DAG nodes get to run before upcoming ones.
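
As a small illustration of the DAG constraint (a sketch only; the `DagScheduler` class and its method names are made up for this example), a node becomes eligible to run only once all the nodes it depends on have completed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: DAG nodes become eligible only after their dependencies finish. */
class DagScheduler {
    // node -> the nodes it depends on, kept in insertion order
    private final Map<String, Set<String>> dependencies = new LinkedHashMap<>();
    private final Set<String> completed = new HashSet<>();

    void addNode(String node, String... dependsOn) {
        dependencies.put(node, new LinkedHashSet<>(Arrays.asList(dependsOn)));
    }

    void markCompleted(String node) {
        completed.add(node);
    }

    /** Returns the nodes that are not completed and whose dependencies are all completed. */
    List<String> eligible() {
        List<String> ready = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : dependencies.entrySet()) {
            if (!completed.contains(entry.getKey()) && completed.containsAll(entry.getValue())) {
                ready.add(entry.getKey());
            }
        }
        return ready;
    }
}
```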

On *How security going to be handled?*, no different from how it is handled in the MRAM.

On *I'm also curious about how you would see us getting to what I was talking about previously..*,
I agree with the direction/approach you are proposing in your previous comment. Furthermore,
I'm currently prototyping, after Arun's suggestion, a JobControl subclass that converts the
job dependency tree into a workflow XML (which would then be executed by the WFAM or submitted
to Oozie). If the prototype works as expected I'm planning to open a JIRA introducing a JobControlFactory
which would return the current JobControl as default but via a configuration property could
instead return a WFJobControl implementation based on the prototype I've just described. Then
changing Pig to use the JobControlFactory to create the JobControl instead of a constructor would
give the flexibility of executing Pig in a WFAM or in a current version of Oozie.
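
The factory idea could look roughly like the sketch below. It is illustrative only: the `JobControl` and `WFJobControl` classes here are simplified stand-ins written for this example (not the Hadoop classes), `java.util.Properties` stands in for the Hadoop configuration, and the property name `jobcontrol.factory.impl` is invented.

```java
import java.util.Properties;

/** Simplified stand-in for the existing JobControl (not the Hadoop class). */
class JobControl {
    final String groupName;

    JobControl(String groupName) {
        this.groupName = groupName;
    }

    String describe() {
        return "JobControl(" + groupName + ")";
    }
}

/** Stand-in for the prototype that would emit a workflow XML instead of running jobs itself. */
class WFJobControl extends JobControl {
    WFJobControl(String groupName) {
        super(groupName);
    }

    @Override
    String describe() {
        return "WFJobControl(" + groupName + ")";
    }
}

/** Hypothetical factory: default JobControl unless a property selects the workflow one. */
class JobControlFactory {
    // Invented property name, used only for this sketch.
    static final String IMPL_KEY = "jobcontrol.factory.impl";

    static JobControl create(Properties conf, String groupName) {
        String impl = conf.getProperty(IMPL_KEY, "default");
        if ("workflow".equals(impl)) {
            return new WFJobControl(groupName);
        }
        return new JobControl(groupName);
    }
}
```

A caller such as Pig would then replace `new JobControl(name)` with `JobControlFactory.create(conf, name)` and pick the implementation purely through configuration.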

Finally, on your comment on dynamic workflow generation, it is definitely doable today using
the WorkflowLib API directly. I could put together an example early next week.

                
> Workflow Application Master in YARN
> -----------------------------------
>
>                 Key: MAPREDUCE-4495
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha
>            Reporter: Bo Wang
>            Assignee: Bo Wang
>         Attachments: MAPREDUCE-4495-v1.1.patch, MAPREDUCE-4495-v1.patch, MapReduceWorkflowAM.pdf,
>                      yapp_proposal.txt
>
>
> It is useful to have a workflow application master, which will be capable of running
> a DAG of jobs. The workflow client submits a DAG request to the AM and then the AM will manage
> the life cycle of this application in terms of requesting the needed resources from the RM,
> and starting, monitoring and retrying the application's individual tasks.
> Compared to running Oozie with the current MapReduce Application Master, these are some
> of the advantages:
>  - Less number of consumed resources, since only one application master will be spawned
>    for the whole workflow.
>  - Reuse of resources, since the same resources can be used by multiple consecutive jobs
>    in the workflow (no need to request/wait for resources for every individual job from the central
>    RM).
>  - More optimization opportunities in terms of collective resource requests.
>  - Optimization opportunities in terms of rewriting and composing jobs in the workflow
>    (e.g. pushing down Mappers).
>  - This Application Master can be reused/extended by higher systems like Pig and hive
>    to provide an optimized way of running their workflows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
