hadoop-common-dev mailing list archives

From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4053) Schedulers need to know when a job has completed
Date Mon, 06 Oct 2008 16:07:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637124#action_12637124 ]

Vivek Ratan commented on HADOOP-4053:
-------------------------------------

I had a few questions/comments on _JobStatusChangeEvent_.

- I agree with Hemanth that the old JobStatus and the new JobStatus should be passed in explicitly.
Otherwise there are hidden dependencies in the calling sequence.
- It's not clear to me how the enum values for Events in _JobStatusChangeEvent_ are named.
What does RUN_STATE mean? Does it mean an event that causes a job's run state to change? If
so, do you mean that the job was in a running state and changed to something else, or that its
state changed to a running state? I see the same enum value used for both. In
CapacityScheduler.getTaskFromQueue(), you add a RUN_STATE event when the job's state changes
from PREP to RUNNING. In JobTracker.finalizeJob(), you add a RUN_STATE event when the job's
state changes from RUNNING to something else. I think you need to use separate events and name
them a little more consistently. Or else, just rename the enum value to STATE_CHANGE, which
can be used for any state change. This should be OK, given that you have an old and a new job
status and can figure out how the state changed. In general, the enum values should describe
the change that happened: FINISH_TIME_CHANGED, rather than FINISH_TIME.
- I don't feel very comfortable with the fact that a _JobStatusChangeEvent_ can contain multiple
Events. I see that the only use case is job recovery, when more than one attribute of a job
status has changed. But, abstractly, having a single _JobStatusChangeEvent_ object handle
multiple events is not intuitive. Each event changes the job status. Since
_JobStatusChangeEvent_ only tracks a single pair of old and new JobStatus objects, what you're
really saying is that you can add events as long as each one independently changes the job
status without affecting the other events. What prevents a user, for example, from adding two
RUN_STATE events? Each one changes the job status, but the object can only keep track of two
statuses, one old and one new. I think that, conceptually, a _JobStatusChangeEvent_ object
should map to a single event, which in turn maps to a single pair of JobStatus objects. That's
much cleaner. During the normal running of the JobTracker, you only create a
_JobStatusChangeEvent_ object for a single event. It's only in that one use case of recovering
jobs that you apply multiple changes to a job status, and I think it's OK to call
updateJobListeners() multiple times there. Otherwise, you muddle up the semantics of a
_JobStatusChangeEvent_ object.
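
To make the suggestions above concrete, here is a rough sketch of what I have in mind. It is
not taken from any of the attached patches. All the names in it (RunState, SimpleJobStatus,
EventType, StatusChangeEvent, StatusListener, RecordingListener, RecoveryDemo) are hypothetical
stand-ins rather than the real Hadoop classes, and the job status is reduced to two fields.
Each event object carries exactly one event type and one explicit old/new status pair, the
enum values name the change that happened, and recovery notifies the listener once per change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified stand-ins for illustration; not the real Hadoop types.
enum RunState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

// Immutable snapshot of the two job attributes used in this sketch.
final class SimpleJobStatus {
    final RunState runState;
    final long finishTime;

    SimpleJobStatus(RunState runState, long finishTime) {
        this.runState = Objects.requireNonNull(runState);
        this.finishTime = finishTime;
    }
}

// Enum values describe the change that happened.
enum EventType { RUN_STATE_CHANGED, FINISH_TIME_CHANGED }

// One event object maps to one change and one explicit old/new status pair.
final class StatusChangeEvent {
    final EventType type;
    final SimpleJobStatus oldStatus;
    final SimpleJobStatus newStatus;

    StatusChangeEvent(EventType type, SimpleJobStatus oldStatus, SimpleJobStatus newStatus) {
        this.type = Objects.requireNonNull(type);
        this.oldStatus = Objects.requireNonNull(oldStatus);
        this.newStatus = Objects.requireNonNull(newStatus);
    }

    // The direction of a run-state change is derived from the status pair,
    // so a single RUN_STATE_CHANGED value covers every transition.
    boolean isCompletion() {
        return type == EventType.RUN_STATE_CHANGED
            && oldStatus.runState == RunState.RUNNING
            && (newStatus.runState == RunState.SUCCEEDED
                || newStatus.runState == RunState.FAILED
                || newStatus.runState == RunState.KILLED);
    }
}

interface StatusListener {
    void jobUpdated(StatusChangeEvent event);
}

// Test-style listener that just records what it was told.
final class RecordingListener implements StatusListener {
    final List<StatusChangeEvent> seen = new ArrayList<>();

    @Override
    public void jobUpdated(StatusChangeEvent event) {
        seen.add(event);
    }
}

final class RecoveryDemo {
    // Recovery applies two changes, so it notifies the listener twice,
    // each time with its own old/new pair.
    static void replay(StatusListener listener) {
        SimpleJobStatus running = new SimpleJobStatus(RunState.RUNNING, 0L);
        SimpleJobStatus succeeded = new SimpleJobStatus(RunState.SUCCEEDED, 0L);
        SimpleJobStatus finished = new SimpleJobStatus(RunState.SUCCEEDED, 1000L);

        listener.jobUpdated(
            new StatusChangeEvent(EventType.RUN_STATE_CHANGED, running, succeeded));
        listener.jobUpdated(
            new StatusChangeEvent(EventType.FINISH_TIME_CHANGED, succeeded, finished));
    }
}
```

With this shape a scheduler can detect completion by looking at a single event, and there is
no way to attach two run-state changes to one status pair.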



> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch,
>                      HADOOP-4053-v3.2.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of
> when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to
> know that a job has completed. jobRemoved() is called when a job is retired, which can happen
> many hours after a job is actually completed. jobUpdated() is called when a job's priority
> is changed. We need to notify a listener when a job has completed (either successfully, or
> has failed or been killed).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

