hadoop-common-dev mailing list archives

From "Amareshwari Sri Ramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker
Date Wed, 31 Oct 2007 12:36:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539067 ]

Amareshwari Sri Ramadasu commented on HADOOP-1900:

In the attached patch, the JobTracker periodically recalculates the heartbeat interval. It
takes into account both the cluster size and how busy the JobTracker is. If the JobTracker is
busy, the interval is incremented by a busyFactor. If it is not busy for two consecutive
periods, the interval is decremented by the busyFactor.
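
As a rough illustration of the busy/idle adjustment described above, here is a minimal sketch.
It is not the code in patch-1900.txt: the class name, the millisecond units, the lower bound
minIntervalMs, and resetting the idle counter after each decrement are all assumptions, and the
cluster-size component is left out because the comment does not say how it is computed.

```java
// Hypothetical sketch, not the patch: adjusts a heartbeat interval from a busy/idle signal.
final class HeartbeatIntervalSketch {
    private long intervalMs;            // current heartbeat interval
    private final long busyFactorMs;    // step added or subtracted (the "busyFactor")
    private final long minIntervalMs;   // assumed lower bound, not stated in the comment
    private int idlePeriods = 0;        // consecutive periods in which the tracker was not busy

    HeartbeatIntervalSketch(long initialMs, long busyFactorMs, long minIntervalMs) {
        this.intervalMs = initialMs;
        this.busyFactorMs = busyFactorMs;
        this.minIntervalMs = minIntervalMs;
    }

    // Called once per recalculation period; returns the interval to hand out next.
    long update(boolean busy) {
        if (busy) {
            // Busy: back off by one busyFactor and restart the idle count.
            intervalMs += busyFactorMs;
            idlePeriods = 0;
        } else {
            idlePeriods++;
            if (idlePeriods >= 2) {
                // Two consecutive idle periods: shrink by one busyFactor, never below the floor.
                intervalMs = Math.max(minIntervalMs, intervalMs - busyFactorMs);
                idlePeriods = 0; // assumption: require two more idle periods before the next decrement
            }
        }
        return intervalMs;
    }
}
```

How "busy" is detected is deliberately left to the caller here, since the comment does not
define it.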

The map events polling interval is derived from the heartbeat interval, so it does not need a
separate recalculation. It is calculated as follows:
polling_interval = heartbeat_interval/3;
if polling_interval < MIN_POLLING_INTERVAL, then polling_interval = MIN_POLLING_INTERVAL;
if polling_interval > MAX_POLLING_INTERVAL, then polling_interval = MAX_POLLING_INTERVAL;
MapEventsFetcherThread is notified if a reduce task doesn't find map events at the tasktracker.
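
The polling interval rule above amounts to a divide-by-three followed by a clamp. A minimal
sketch, assuming millisecond units and taking the bounds as parameters because the actual
values of MIN_POLLING_INTERVAL and MAX_POLLING_INTERVAL are not given in this comment (the
class and method names are hypothetical):

```java
// Hypothetical sketch of the polling interval rule; not taken from the patch.
final class PollingIntervalSketch {
    private PollingIntervalSketch() {}

    // Returns heartbeatMs / 3, clamped to the range [minPollingMs, maxPollingMs].
    static long pollingInterval(long heartbeatMs, long minPollingMs, long maxPollingMs) {
        long polling = heartbeatMs / 3;   // integer division, as in the formula above
        if (polling < minPollingMs) {
            polling = minPollingMs;       // raise to the lower bound
        }
        if (polling > maxPollingMs) {
            polling = maxPollingMs;       // cap at the upper bound
        }
        return polling;
    }
}
```

For example, with assumed bounds of 1000 and 5000, a heartbeat interval of 9000 gives a polling
interval of 3000, while 1500 is raised to 1000 and 30000 is capped at 5000.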

bq. I propose a change to the status message in the heartbeat - the tasktracker can compare
the current task status with the previous one and if it finds the status to be the same, it
doesn't send the complete status object to the JobTracker, but just a flag saying it is a
duplicate or something to that effect. That will reduce the data per RPC considerably for
long running tasks whose statuses don't change frequently and also reduce the processing load
on the JobTracker.

This will be addressed in a separate JIRA issue.

> the heartbeat and task event queries interval should be set dynamically by the JobTracker
> -----------------------------------------------------------------------------------------
>                 Key: HADOOP-1900
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1900
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Amareshwari Sri Ramadasu
>         Attachments: patch-1900.txt
> The JobTracker should scale the intervals that the TaskTrackers use to contact it dynamically,
> based on how busy it is and the size of the cluster.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
