Return-Path:
Delivered-To: apmail-hadoop-mapreduce-issues-archive@minotaur.apache.org
Received: (qmail 50392 invoked from network); 14 Oct 2010 09:46:02 -0000
Received: from unknown (HELO mail.apache.org) (140.211.11.3)
  by 140.211.11.9 with SMTP; 14 Oct 2010 09:46:02 -0000
Received: (qmail 6833 invoked by uid 500); 14 Oct 2010 09:46:02 -0000
Delivered-To: apmail-hadoop-mapreduce-issues-archive@hadoop.apache.org
Received: (qmail 6747 invoked by uid 500); 14 Oct 2010 09:45:59 -0000
Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: mapreduce-issues@hadoop.apache.org
Delivered-To: mailing list mapreduce-issues@hadoop.apache.org
Received: (qmail 6736 invoked by uid 99); 14 Oct 2010 09:45:58 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
  by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 14 Oct 2010 09:45:58 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED
X-Spam-Check-By: apache.org
Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22)
  by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 14 Oct 2010 09:45:56 +0000
Received: from thor (localhost [127.0.0.1])
  by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id o9E9jYeP005308
  for ; Thu, 14 Oct 2010 09:45:34 GMT
Message-ID: <3158164.139381287049534361.JavaMail.jira@thor>
Date: Thu, 14 Oct 2010 05:45:34 -0400 (EDT)
From: "luoli (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Commented: (MAPREDUCE-2116) optimize getTasksToKill to reduce JobTracker contention
In-Reply-To: <9109189.21821286438492500.JavaMail.jira@thor>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920905#action_12920905 ]

luoli commented on MAPREDUCE-2116:
----------------------------------

bq. Joydeep has similar idea. But he created another class that keeps a pair. We are testing this internally.

Yes, that can work too.

We have also found why the taskStatuses.get call in shouldClose() consumes so much time: getTasksToKill() calls shouldClose() a very large number of times. getTasksToKill() iterates over every TaskAttemptID in the set returned by

    Set<TaskAttemptID> taskIds = trackerToTaskMap.get(taskTracker);

and that set can contain hundreds of entries when the tasktracker has finished many task attempts whose jobs have not yet finished. All of those attempts remain in the value set of trackerToTaskMap. As a result, even attempts that finished long ago are iterated over on every heartbeat of the tasktracker until the jobs they belong to have finished. This is not good.

> optimize getTasksToKill to reduce JobTracker contention
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-2116
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2116
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>            Reporter: Joydeep Sen Sarma
>         Attachments: 2116.1.patch, getTaskToKill.JPG
>
>
> getTasksToKill shows up as one of the top routines holding the JT lock.
> Specifically, the translation from attemptid to tip is very expensive:
>         at java.util.TreeMap.getEntry(TreeMap.java:328)
>         at java.util.TreeMap.get(TreeMap.java:255)
>         at org.apache.hadoop.mapred.TaskInProgress.shouldClose(TaskInProgress.java:500)
>         at org.apache.hadoop.mapred.JobTracker.getTasksToKill(JobTracker.java:3464)
>         locked <0x00002aab6ebb6640> (a org.apache.hadoop.mapred.JobTracker)
>         at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3181)
> this seems like an avoidable expense since the tip for a given attempt is fixed (and one should not need a map lookup to find the association).
> on a different note - not clear to me why TreeMaps are in use here (i didn't find any iteration over these maps). any background info on why things are arranged the way they are would be useful.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
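
To make the cost described in the comment concrete, here is a minimal sketch in Java. It is a simplified model and not the actual JobTracker code: the `Tip` class, the `String` attempt IDs, and the `countLookups` helper are all hypothetical stand-ins, and only the names trackerToTaskMap, getTasksToKill, and shouldClose come from the discussion above. It counts how many shouldClose() lookups happen when finished attempts stay in the tracker's set across heartbeats.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Simplified, hypothetical model of the heartbeat scan; not the real JobTracker.
public class HeartbeatScanSketch {

    // Stand-in for TaskInProgress: per-attempt state kept in a TreeMap,
    // so every shouldClose() call pays for one TreeMap lookup.
    static class Tip {
        final Map<String, Boolean> closeFlagByAttempt = new TreeMap<>();
        int lookups = 0;

        boolean shouldClose(String attemptId) {
            lookups++;
            Boolean close = closeFlagByAttempt.get(attemptId);
            return close != null && close;
        }
    }

    // Mirrors the loop described above: every attempt still recorded for the
    // tracker is checked on each heartbeat, whether or not it finished long ago.
    static List<String> getTasksToKill(String tracker,
                                       Map<String, Set<String>> trackerToTaskMap,
                                       Map<String, Tip> attemptToTip) {
        List<String> killList = new ArrayList<>();
        Set<String> taskIds = trackerToTaskMap.get(tracker);
        if (taskIds != null) {
            for (String attemptId : taskIds) {
                Tip tip = attemptToTip.get(attemptId);
                if (tip != null && tip.shouldClose(attemptId)) {
                    killList.add(attemptId);
                }
            }
        }
        return killList;
    }

    // Total shouldClose() calls for one tracker that retains
    // `finishedAttempts` attempts over `heartbeats` heartbeats.
    static int countLookups(int finishedAttempts, int heartbeats) {
        Tip tip = new Tip();
        Map<String, Tip> attemptToTip = new HashMap<>();
        Set<String> attempts = new LinkedHashSet<>();
        for (int i = 0; i < finishedAttempts; i++) {
            String id = "attempt_" + i;
            attempts.add(id);
            attemptToTip.put(id, tip);
            tip.closeFlagByAttempt.put(id, false); // nothing needs killing
        }
        Map<String, Set<String>> trackerToTaskMap = new HashMap<>();
        trackerToTaskMap.put("tracker_1", attempts);

        for (int h = 0; h < heartbeats; h++) {
            getTasksToKill("tracker_1", trackerToTaskMap, attemptToTip);
        }
        return tip.lookups;
    }

    public static void main(String[] args) {
        // 300 retained attempts scanned on each of 10 heartbeats.
        System.out.println(countLookups(300, 10)); // prints 3000
    }
}
```

In this model the number of lookups is the number of retained attempts times the number of heartbeats, even though no attempt ever needs to be killed, which is the pattern the comment identifies as the source of the contention.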