hadoop-yarn-issues mailing list archives

From "Richard Chen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-2172) Suspend/Resume Hadoop Jobs
Date Tue, 17 Jun 2014 15:08:04 GMT

     [ https://issues.apache.org/jira/browse/YARN-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Chen updated YARN-2172:
-------------------------------

    Description: 
In a multi-application cluster environment, jobs running inside Hadoop YARN may have lower priority
than jobs running outside Hadoop YARN, such as HBase. To give way to other higher-priority jobs
inside Hadoop, a user or a cluster-level resource scheduling service should be able to
suspend and/or resume particular jobs within Hadoop YARN.

When target jobs inside Hadoop are suspended, task containers that are already allocated and
running continue to run until they complete or are actively preempted by other means, but no
new containers are allocated to the target jobs. In contrast, when suspended jobs are
resumed, they continue from their previous progress and have new task containers
allocated to complete the remaining work.
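
The suspend/resume semantics described above can be illustrated with a small toy model.
This is only a sketch under stated assumptions, not the code from the patch and not the YARN
API: ToyScheduler, Job, and every method name below are hypothetical and exist purely to
show the intended behavior (running containers survive a suspend, new allocations are
refused until resume).

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical job record; not a YARN class."""
    name: str
    pending_tasks: int
    running: set = field(default_factory=set)
    completed: int = 0
    suspended: bool = False


class ToyScheduler:
    """Hypothetical scheduler that models only suspend/resume."""

    def __init__(self):
        self.jobs = {}
        self._next_id = 0

    def submit(self, name, tasks):
        self.jobs[name] = Job(name, tasks)

    def suspend(self, name):
        # Running containers are left untouched; only allocation is blocked.
        self.jobs[name].suspended = True

    def resume(self, name):
        self.jobs[name].suspended = False

    def allocate(self, name, count):
        job = self.jobs[name]
        if job.suspended:
            return []  # no new containers while the job is suspended
        granted = []
        for _ in range(min(count, job.pending_tasks)):
            self._next_id += 1
            container_id = f"container_{self._next_id}"
            job.running.add(container_id)
            granted.append(container_id)
        job.pending_tasks -= len(granted)
        return granted

    def finish(self, name, container_id):
        # A container may finish whether or not its job is suspended.
        job = self.jobs[name]
        job.running.remove(container_id)
        job.completed += 1


sched = ToyScheduler()
sched.submit("etl", tasks=4)
first = sched.allocate("etl", 2)    # two containers start running
sched.suspend("etl")
blocked = sched.allocate("etl", 2)  # refused while suspended
sched.finish("etl", first[0])       # an already running container still completes
sched.resume("etl")
rest = sched.allocate("etl", 2)     # remaining tasks get containers after resume
```

In this model, progress made before the suspend (the completed counter) is preserved
across resume, which mirrors the "continue from their previous progress" behavior in the
description.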

My team has completed its implementation, and our tests show that it works fairly reliably.


  was:
In a multi-application cluster environment, jobs running inside Hadoop YARN may be of lower-priority
than jobs running outside Hadoop YARN like HBase. To give way to other higher-priority jobs
inside Hadoop, a user or some cluster-level resource scheduling service should be able to
suspend and/or resume some particular jobs within Hadoop YARN.

When target jobs inside Hadoop are suspended, those already allocated and running task containers
will continue to run until their completion or active preemption by other ways. But no more
new containers would be allocated to the target jobs. In contrast, when suspended jobs are
put into resume mode, they will continue to run from the previous job progress and have new
task containers allocated to complete the rest of the jobs.

My team has completed its implementation and our tests showed it is working in a rather solid
way. 


> Suspend/Resume Hadoop Jobs
> --------------------------
>
>                 Key: YARN-2172
>                 URL: https://issues.apache.org/jira/browse/YARN-2172
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: resourcemanager, webapp
>    Affects Versions: 2.2.0
>         Environment: CentOS 6.5, Hadoop 2.2.0
>            Reporter: Richard Chen
>              Labels: hadoop, jobs, resume, suspend
>             Fix For: 2.2.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> In a multi-application cluster environment, jobs running inside Hadoop YARN may have lower
> priority than jobs running outside Hadoop YARN, such as HBase. To give way to other
> higher-priority jobs inside Hadoop, a user or a cluster-level resource scheduling service
> should be able to suspend and/or resume particular jobs within Hadoop YARN.
> When target jobs inside Hadoop are suspended, task containers that are already allocated and
> running continue to run until they complete or are actively preempted by other means, but no
> new containers are allocated to the target jobs. In contrast, when suspended jobs are
> resumed, they continue from their previous progress and have new task containers
> allocated to complete the remaining work.
> My team has completed its implementation, and our tests show that it works fairly reliably.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
