hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3674) dynamic heartbeat interval for the locality-aware task scheduling
Date Tue, 01 Jul 2008 08:31:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609490#action_12609490 ]

Amar Kamat commented on HADOOP-3674:

Generally we try to make sure that we don't waste any compute cycles on the tasktrackers. If this
is a big performance hit then we might need to rethink it, although we can bias the decision
of which task to give based on various parameters. See HADOOP-2812 and HADOOP-2014, which are somewhat
related. Let us know why you feel that a non-greedy approach works better here.

> dynamic heartbeat interval for the locality-aware task scheduling
> -----------------------------------------------------------------
>                 Key: HADOOP-3674
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3674
>             Project: Hadoop Core
>          Issue Type: Wish
>          Components: mapred
>            Reporter: Leitao Guo
>            Priority: Minor
> In the current Hadoop release (0.17.0), there is no special scheduling policy for tasktrackers
that have no data for a job, so scheduling can be inefficient in some scenarios. For example,
tasktracker A has the data for a job, but tasktracker B, which has no data for this job, sends
its heartbeat message to the jobtracker asking for a new task before tasktracker A does. The task may then be
scheduled on B instead of A, and the jobtracker has to find a different task for tasktracker A when
A asks for a new task.
> In this situation, it would be more efficient if the jobtracker had a reservation policy, such as reserving the task
for tasktracker A and letting B ask for a new task in its next heartbeat message. By the time
tasktracker B asks for a new task the second time, tasktracker
A has already asked for a new task and the jobtracker has scheduled the task on A.
> Here is a rough idea to deal with the scenario above:
> (1) The jobtracker receives the heartbeat message sent by tasktracker B, which has no data
for any job.
> (2) The jobtracker sends a response message to tasktracker B with a new heartbeat interval,
but does not schedule a new task on B. The new heartbeat interval should be shorter than the current
heartbeat interval, for example, current_heartbeat_interval/2.
> (3) Tasktracker B receives the response from the jobtracker, and sends another heartbeat message
asking for a new task after a period of current_heartbeat_interval/2.
> (4) The jobtracker then finds a new task for tasktracker B.
> This is just a preliminary idea for improving the locality-aware scheduling. Any
comments are welcome.
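
Steps (1) to (4) of the quoted proposal can be sketched as a small, self-contained Java simulation. This is only an illustration under stated assumptions, not Hadoop code: `HeartbeatSketch`, its toy `JobTracker` and `Response` classes, the `addTask` and `heartbeat` methods, and the 5000 ms default interval are all hypothetical names and values chosen for the example. The sketch assumes a tracker without local data is deferred at most once before it receives a non-local task.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the proposed policy; none of these names are Hadoop APIs. */
final class HeartbeatSketch {

    /** Reply to a heartbeat: an optional task plus the interval before the next heartbeat. */
    static final class Response {
        final String task;           // null means no task was assigned
        final long nextIntervalMs;

        Response(String task, long nextIntervalMs) {
            this.task = task;
            this.nextIntervalMs = nextIntervalMs;
        }
    }

    /** Toy jobtracker that defers a tracker once when it holds no local data. */
    static final class JobTracker {
        private final long defaultIntervalMs;
        // Pending task -> trackers that hold its input data (insertion order kept).
        private final Map<String, Set<String>> pending = new LinkedHashMap<>();
        // Trackers that were already asked to come back after a shortened interval.
        private final Set<String> deferred = new HashSet<>();

        JobTracker(long defaultIntervalMs) {
            this.defaultIntervalMs = defaultIntervalMs;
        }

        void addTask(String task, String... trackersWithData) {
            pending.put(task, new HashSet<>(Arrays.asList(trackersWithData)));
        }

        Response heartbeat(String tracker) {
            // Prefer a task whose data lives on the requesting tracker.
            Iterator<Map.Entry<String, Set<String>>> it = pending.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Set<String>> entry = it.next();
                if (entry.getValue().contains(tracker)) {
                    String task = entry.getKey();
                    it.remove();
                    deferred.remove(tracker);
                    return new Response(task, defaultIntervalMs);
                }
            }
            if (pending.isEmpty()) {
                return new Response(null, defaultIntervalMs);
            }
            // Step (2): first request without local data gets no task and a halved interval.
            if (deferred.add(tracker)) {
                return new Response(null, defaultIntervalMs / 2);
            }
            // Step (4): on the follow-up heartbeat, hand out a non-local task.
            deferred.remove(tracker);
            Iterator<String> tasks = pending.keySet().iterator();
            String task = tasks.next();
            tasks.remove();
            return new Response(task, defaultIntervalMs);
        }
    }

    public static void main(String[] args) {
        JobTracker jobTracker = new JobTracker(5000L);
        jobTracker.addTask("t1", "A");
        jobTracker.addTask("t2", "A");
        for (String tracker : new String[] {"B", "A", "B"}) {
            Response r = jobTracker.heartbeat(tracker);
            System.out.println(tracker + " -> task=" + r.task + ", next=" + r.nextIntervalMs + "ms");
        }
    }
}
```

In this run, B's first heartbeat is answered with no task and a 2500 ms interval, A then receives the data-local task t1, and B's second heartbeat receives t2 as a non-local task. The cost raised in the comment above is visible here as well: B's slot stays idle for the shortened interval, so whether the locality gain outweighs the lost cycles would have to be measured.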

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
