hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4053) Schedulers need to know when a job has completed
Date Wed, 01 Oct 2008 11:15:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v3.1.patch

Attaching a patch that implements the {{JobChangeEvent}} concept. Here is how it is implemented.

_Assumptions :_
Everything that has the potential to change a job's state is captured and bundled under {{JobStatus}}.
Hence taking a snapshot of the job's status before and after the event should be sufficient to determine
the state change.
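
To illustrate the assumption above, here is a minimal sketch of comparing two status snapshots. It is not code from the patch: {{StatusSnapshot}}, {{SubEvent}} and {{SnapshotDiff}} are hypothetical names, and the three fields are simplified stand-ins for the {{JobStatus}} fields mentioned in this comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the JobStatus fields named in this comment.
final class StatusSnapshot {
    final String priority;
    final long startTime;
    final String runState;

    StatusSnapshot(String priority, long startTime, String runState) {
        this.priority = priority;
        this.startTime = startTime;
        this.runState = runState;
    }
}

// Hypothetical labels for the kinds of change a snapshot comparison can reveal.
enum SubEvent { PRIORITY_CHANGED, START_TIME_CHANGED, RUN_STATE_CHANGED }

final class SnapshotDiff {
    // Compares the snapshot taken before an event with the one taken after it
    // and reports which fields differ, in a fixed order.
    static List<SubEvent> diff(StatusSnapshot before, StatusSnapshot after) {
        List<SubEvent> changes = new ArrayList<>();
        if (!before.priority.equals(after.priority)) {
            changes.add(SubEvent.PRIORITY_CHANGED);
        }
        if (before.startTime != after.startTime) {
            changes.add(SubEvent.START_TIME_CHANGED);
        }
        if (!before.runState.equals(after.runState)) {
            changes.add(SubEvent.RUN_STATE_CHANGED);
        }
        return changes;
    }
}
```

An empty result means the event did not change any of the tracked fields, so a listener would have nothing to act on.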

_Working :_
1) {{JobInProgressListener.jobUpdated()}} now takes {{JobChangeEvent}} as a parameter.

2) {{JobChangeEvent}} is an abstract class that exposes just one API, {{getJobInProgress()}}.

3) For the task at hand, i.e., handling _priority-change_, _start-time-change_ and _job-runstate-change_,
I have extended {{JobChangeEvent}} to {{JobStatusChangeEvent}}.

4) {{JobStatusChangeEvent}} hosts a set of _sub-events_ that can lead to a job-status change.
These are fields from {{JobStatus}} that have the potential to change for a given job. Some of
them are _priority, start-time, run-state_ etc. While composing an event, one can specify
which _sub-events_ constitute the state change. Note that the order in which the _sub-events_
are specified is also preserved.

5) For the capacity scheduler, the appropriate action is performed based on the _sub-events_
constituting the state transition. For now the actions are
    - promote a job from the waiting queue to the running queue
    - remove a job upon job completion
    - re-position the job in the queue when the parameters that decide where the job is positioned have changed

6) If {{JobStatusChangeEvent}} fails to capture all the events then {{JobChangeEvent}} can
be extended to cater to that case.

7) Other listener implementations remain unchanged as they just require {{jobInProgress}}
which is obtained from {{JobChangeEvent}}.
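
The points above can be sketched roughly as follows. This is a simplified illustration, not the code in the attached patch. {{JobChangeEvent}}, {{JobStatusChangeEvent}}, {{getJobInProgress()}} and {{jobUpdated()}} are the names used in this comment; {{ChangeKind}}, {{QueueListener}}, the constructor signatures, the fields on the cut-down {{JobInProgress}} and the ordering rule are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical, heavily simplified job handle; the real JobInProgress is far richer.
final class JobInProgress {
    enum RunState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    final String id;
    int priority;      // assumption: a lower value is scheduled earlier
    long startTime;
    RunState runState = RunState.PREP;

    JobInProgress(String id, int priority, long startTime) {
        this.id = id;
        this.priority = priority;
        this.startTime = startTime;
    }

    boolean isComplete() {
        return runState == RunState.SUCCEEDED
            || runState == RunState.FAILED
            || runState == RunState.KILLED;
    }
}

// Base event: exposes only the job it refers to (point 2).
abstract class JobChangeEvent {
    private final JobInProgress job;
    JobChangeEvent(JobInProgress job) { this.job = job; }
    JobInProgress getJobInProgress() { return job; }
}

// Hypothetical names for the sub-events described in point 4.
enum ChangeKind { PRIORITY_CHANGED, START_TIME_CHANGED, RUN_STATE_CHANGED }

// Status-change event carrying an ordered list of sub-events (points 3 and 4).
final class JobStatusChangeEvent extends JobChangeEvent {
    private final List<ChangeKind> kinds;

    JobStatusChangeEvent(JobInProgress job, ChangeKind... kinds) {
        super(job);
        // Copy so the order given by the caller is preserved and cannot be altered later.
        this.kinds = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(kinds)));
    }

    List<ChangeKind> getKinds() { return kinds; }
}

interface JobInProgressListener {
    void jobUpdated(JobChangeEvent event);
}

// Hypothetical scheduler-side listener performing the three actions of point 5.
final class QueueListener implements JobInProgressListener {
    final List<JobInProgress> waiting = new ArrayList<>();
    final List<JobInProgress> running = new ArrayList<>();

    private static final Comparator<JobInProgress> ORDER =
        Comparator.comparingInt((JobInProgress j) -> j.priority)
                  .thenComparingLong(j -> j.startTime);

    void jobAdded(JobInProgress job) {
        waiting.add(job);
        waiting.sort(ORDER);
    }

    @Override
    public void jobUpdated(JobChangeEvent event) {
        if (!(event instanceof JobStatusChangeEvent)) {
            return; // an event type this listener does not understand
        }
        JobInProgress job = event.getJobInProgress();
        for (ChangeKind kind : ((JobStatusChangeEvent) event).getKinds()) {
            switch (kind) {
                case RUN_STATE_CHANGED:
                    if (job.isComplete()) {
                        // remove a job upon completion
                        waiting.remove(job);
                        running.remove(job);
                    } else if (job.runState == JobInProgress.RunState.RUNNING
                               && waiting.remove(job)) {
                        // promote from the waiting queue to the running queue
                        running.add(job);
                        running.sort(ORDER);
                    }
                    break;
                case PRIORITY_CHANGED:
                case START_TIME_CHANGED:
                    // re-position jobs whose ordering parameters changed
                    waiting.sort(ORDER);
                    running.sort(ORDER);
                    break;
            }
        }
    }
}
```

A listener that only needs the job itself can ignore the event subtype and call {{getJobInProgress()}} on whatever {{JobChangeEvent}} it receives, which is why such listeners need no further changes.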

Tested the patch with the capacity scheduler and it works fine. The web-ui doesn't show completed
jobs in the job queue, which means that the job is removed upon completion. _test-patch_ and
_ant test_ pass on my box. The rest of the listener implementations should not be affected.
This patch is meant for 0.19.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of
> when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to
> know that a job has completed. jobRemoved() is called when a job is retired, which can happen
> many hours after a job is actually completed. jobUpdated() is called when a job's priority
> is changed. We need to notify a listener when a job has completed (either successfully, or
> has failed or been killed).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

