Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <27777180.1192704050866.JavaMail.jira@brutus>
Date: Thu, 18 Oct 2007 03:40:50 -0700 (PDT)
From: "Arun C Murthy (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1900) the heartbeat and task event
 queries interval should be set dynamically by the JobTracker
In-Reply-To: <30973750.1189796072225.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535879 ] 

Arun C Murthy commented on HADOOP-1900:
---------------------------------------

bq. I wonder if instead we should just make it clusterSize/50+1? That way, small clusters will get a heartbeat of just one second, which should make them more responsive.

+1

I'd like to see some numbers on how long it takes to process a heartbeat, etc., before we decide on the actual scaling factors (both up and down). Given that we have so far run clusters of 2000 nodes with a heartbeat interval of 10s, I suspect that scaling it up by 10s for every 500 nodes is too conservative. Anyway, I'll believe the numbers when we have them.
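
For reference, here is a rough sketch of what the quoted clusterSize/50+1 rule works out to at a few cluster sizes. It assumes integer division and an interval measured in seconds; the class and method names are made up for illustration and do not come from any patch or from the Hadoop code base.

```java
// Sketch only: hypothetical names, not Hadoop APIs.
final class HeartbeatIntervalSketch {

    /** Interval in seconds under the clusterSize/50+1 rule (integer division assumed). */
    static int intervalSeconds(int clusterSize) {
        if (clusterSize < 0) {
            throw new IllegalArgumentException("clusterSize must be non-negative");
        }
        return clusterSize / 50 + 1;
    }

    public static void main(String[] args) {
        // Cluster sizes chosen only to show the shape of the curve.
        int[] sizes = {10, 49, 50, 500, 2000};
        for (int size : sizes) {
            System.out.println(size + " nodes -> " + intervalSeconds(size) + "s");
        }
    }
}
```

Under these assumptions a 2000-node cluster would get a 41s interval, against the 10s we have actually run with at that size.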

Also, while we are at it, I think we should start to consider the *busy-ness* of the JobTracker too, along with the cluster size. For example, if individual tasks take on the order of minutes, it might not matter much whether a heartbeat is sent every 20s or so; in other cases it might. I know that the sort's map tasks take around 40s each...

So, one way to take this into account might be to maintain an average time-to-complete for all tasks in the system (of current jobs) and factor that into the scaling of the intervals.
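
A minimal sketch of that idea follows, purely as one possible shape for it. Everything here is an assumption for illustration: the class and method names, the use of a plain running mean, the 0.25 fraction of the average task time, and the choice to cap the cluster-size interval rather than scale it some other way.

```java
// Sketch only: names, the 0.25 fraction and the 1s floor are assumptions.
final class TaskAwareIntervalSketch {
    private static final double TASK_TIME_FRACTION = 0.25; // hypothetical tuning knob

    private long completedTasks = 0;
    private double avgTaskSeconds = 0.0;

    /** Fold one finished task's duration into the running mean. */
    synchronized void taskCompleted(double durationSeconds) {
        completedTasks++;
        avgTaskSeconds += (durationSeconds - avgTaskSeconds) / completedTasks;
    }

    synchronized double averageTaskSeconds() {
        return avgTaskSeconds;
    }

    /**
     * Start from a cluster-size based interval, then cap it at a fraction of
     * the average task time so that short tasks are not left waiting on a
     * long heartbeat. Never returns less than 1 second.
     */
    synchronized int intervalSeconds(int clusterSize) {
        int bySize = clusterSize / 50 + 1;
        if (completedTasks == 0) {
            return bySize; // no task timings yet, fall back to cluster size alone
        }
        int byTaskTime = (int) Math.max(1, Math.floor(avgTaskSeconds * TASK_TIME_FRACTION));
        return Math.min(bySize, byTaskTime);
    }
}
```

With tasks averaging 40s, as for the sort's maps, this would hold a 2000-node cluster to a 10s interval, while jobs whose tasks run for minutes would fall back to the cluster-size value. Whether the JobTracker can sustain the shorter interval is exactly what the heartbeat-processing numbers would need to show.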


> the heartbeat and task event queries interval should be set dynamically by the JobTracker
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1900
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1900
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Amareshwari Sri Ramadasu
>
> The JobTracker should scale the intervals that the TaskTrackers use to contact it dynamically, based on how busy it is and the size of the cluster.
