hadoop-mapreduce-issues mailing list archives

From "Sreekanth Ramakrishnan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-802) Simplify the job updated event notification between Jobtracker and schedulers
Date Tue, 04 Aug 2009 05:27:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738834#action_12738834 ]

Sreekanth Ramakrishnan commented on MAPREDUCE-802:
--------------------------------------------------

Also as part of this issue, I propose keeping a single source of {{startTime}} and moving
it to {{JobStatus}}. The {{START_TIME_CHANGED}} event is raised on {{JobStatus}}, so if
someone changes only the {{startTime}} field of {{JobInProgress}} and forgets to update
{{JobStatus.startTime}}, the correct start time might not be propagated.
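
To illustrate the proposal, here is a minimal sketch of what a single source of {{startTime}} could look like. The classes {{JobStatusSketch}} and {{JobInProgressSketch}} are hypothetical, simplified stand-ins invented for this example; they are not the actual Hadoop classes or the patch for this issue. The only point is that the job object keeps no start time field of its own and delegates to its status object, so the two values cannot drift apart.

```java
// Hypothetical, simplified stand-ins for illustration only;
// these are NOT the real Hadoop JobStatus / JobInProgress classes.
public class SingleStartTimeSketch {

    /** Holds the only stored copy of the start time. */
    public static class JobStatusSketch {
        private long startTime;

        public synchronized long getStartTime() {
            return startTime;
        }

        public synchronized void setStartTime(long startTime) {
            this.startTime = startTime;
        }
    }

    /** Keeps no startTime field of its own; delegates to its status object. */
    public static class JobInProgressSketch {
        private final JobStatusSketch status = new JobStatusSketch();

        public JobStatusSketch getStatus() {
            return status;
        }

        public long getStartTime() {
            return status.getStartTime();
        }

        public void setStartTime(long startTime) {
            // Writing through to the status object means any event built
            // from the status sees the updated value.
            status.setStartTime(startTime);
        }
    }

    public static void main(String[] args) {
        JobInProgressSketch job = new JobInProgressSketch();
        job.setStartTime(42L);
        // Both views agree because there is only one stored value.
        System.out.println(job.getStartTime() == job.getStatus().getStartTime());
    }
}
```

With this shape, a code path that updates the start time through the job object cannot leave the status object stale, which is the failure mode described above.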


> Simplify the job updated event notification between Jobtracker and schedulers
> -----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-802
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-802
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>            Reporter: Hemanth Yamijala
>            Assignee: Sreekanth Ramakrishnan
>
> HADOOP-4053 and HADOOP-4149 added events so that updates to a job's state or
> properties, such as its run state or priority, are notified to the scheduler.
> We've seen some issues with this framework, such as the following:
> - Events are not raised correctly at all places. If a new code path is added to
>   kill a job, raising the events is easily missed.
> - Events are raised with incorrect event data. For example, the start time value
>   is typically left out.
> The resulting contract break between the jobtracker and the schedulers has led to
> problems in the capacity scheduler, such as jobs remaining stuck in the queue
> without ever being removed.
> It has proven complicated to get this right in the framework: fixes have typically
> still left dangling cases, or new code paths have introduced new bugs.
> This JIRA is about trying to simplify the interaction model so that it is more
> robust and works well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

