Message-ID: <21842905.1202367548230.JavaMail.jira@brutus>
Date: Wed, 6 Feb 2008 22:59:08 -0800 (PST)
From: "Amar Kamat (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Subject: [jira] Commented: (HADOOP-2790) TaskInProgress.hasSpeculativeTask is very inefficient
In-Reply-To: <14988924.1202352789631.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HADOOP-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566474#action_12566474 ]

Amar Kamat commented on HADOOP-2790:
------------------------------------

bq.
Also, another obvious optimization is to check whether the speculative execution flag is true up front.

I noticed that too a few days back, but I thought HADOOP-2141 might fix it.

----

With HADOOP-2119, the number of calls to {{hasSpeculativeTask()}} might drop, since we are optimizing the look-ups for finding the higher-priority runnable tasks and avoiding speculative ones entirely in these look-ups. So the check for speculative tasks will be done only if we have nothing else to run. But +1 to doing better than making all the checks all the time.

The following parameters are used for deciding {{TaskInProgress.hasSpeculativeTask()}}:
- activeTasks.size() <= MAX_TASK_EXECS _[seems ok]_
- runSpeculative _[should be done earlier, but ok]_
- averageProgress - progress >= SPECULATIVE_GAP _[seems ok]_
- System.currentTimeMillis() - startTime >= SPECULATIVE_LAG : This could be checked once in {{TaskInProgress.recomputeProgress()}}, and {{hasSpeculativeTask()}} would repeat the check only if the earlier one came out {{false}}. I guess we can still do better, but my guess is that we can't totally avoid {{System.currentTimeMillis()}} in {{TaskInProgress.hasSpeculativeTask()}}, no?
- completes == 0 _[ok]_
- !isOnlyCommitPending() : Maybe a Map for _COMMIT_PENDING_ tasks can be maintained in _TaskInProgress_, and the only check made is {{commitPendingStatuses.size() > 0 && commitPendingStatuses.contains(taskId)}}. The space requirement will be the same, with the re-arrangement done in {{TaskInProgress.recomputeProgress()}}.

----

Comments?

> TaskInProgress.hasSpeculativeTask is very inefficient
> -----------------------------------------------------
>
>                 Key: HADOOP-2790
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2790
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>             Fix For: 0.16.1
>
>
> Each call to JobInProgress.findNewTask can call TaskInProgress.hasSpeculativeTask once per task.
> Each call to hasSpeculativeTask calls System.currentTimeMillis, which can result in hundreds of thousands of calls to currentTimeMillis. Additionally, it calls TaskInProgress.isOnlyCommitPending, which calls .values() on the map from task id to host name and iterates through them to see if any of the tasks are in commit pending. It would be better to have a commit pending boolean flag in the TaskInProgress. It also looks like there are other opportunities here, but those jumped out at me. We should also look at this method in the profiler.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
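For illustration only, the three ideas discussed in this thread (test the speculative-execution flag first, read the clock once per scan instead of once per task, and track commit-pending state incrementally instead of iterating over all task statuses) could be sketched roughly as follows. This is not the actual Hadoop code or the eventual patch: SpeculationSketch, TaskSketch, setCommitPending, countSpeculatable, the extra "now" parameter and the threshold values are all hypothetical names and placeholders chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SpeculationSketch {
    // Placeholder thresholds for illustration; not values taken from this thread.
    static final int MAX_TASK_EXECS = 1;
    static final double SPECULATIVE_GAP = 0.2;
    static final long SPECULATIVE_LAG = 60_000L;

    /** Hypothetical, stripped-down stand-in for TaskInProgress. */
    static class TaskSketch {
        final boolean runSpeculative;
        final long startTime;
        double progress;
        int completes;
        final Set<String> activeTasks = new HashSet<>();
        // Attempts currently in commit-pending state. Updated whenever a status
        // changes, so no scan over all task statuses is needed at query time.
        final Set<String> commitPending = new HashSet<>();

        TaskSketch(boolean runSpeculative, long startTime) {
            this.runSpeculative = runSpeculative;
            this.startTime = startTime;
        }

        /** Assumed hook: called from wherever task status updates are processed. */
        void setCommitPending(String attemptId, boolean pending) {
            if (pending) {
                commitPending.add(attemptId);
            } else {
                commitPending.remove(attemptId);
            }
        }

        /** Constant-time replacement for iterating over the status map. */
        boolean isOnlyCommitPending() {
            return !commitPending.isEmpty();
        }

        /** The caller supplies the current time, so the clock is not read here. */
        boolean hasSpeculativeTask(long now, double averageProgress) {
            if (!runSpeculative) {
                return false; // cheapest check first
            }
            return activeTasks.size() <= MAX_TASK_EXECS
                && completes == 0
                && !isOnlyCommitPending()
                && averageProgress - progress >= SPECULATIVE_GAP
                && now - startTime >= SPECULATIVE_LAG;
        }
    }

    /** Counts tasks eligible for speculation using one shared timestamp. */
    static int countSpeculatable(List<TaskSketch> tasks, double averageProgress, long now) {
        int count = 0;
        for (TaskSketch task : tasks) {
            if (task.hasSpeculativeTask(now, averageProgress)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis(); // read the clock once per scan
        List<TaskSketch> tasks = new ArrayList<>();
        TaskSketch slow = new TaskSketch(true, now - 2 * SPECULATIVE_LAG);
        slow.activeTasks.add("attempt_0");
        slow.progress = 0.1;
        tasks.add(slow);
        tasks.add(new TaskSketch(false, now - 2 * SPECULATIVE_LAG));
        System.out.println(countSpeculatable(tasks, 0.5, now));
    }
}
```

With this shape, a scan over N tasks costs one clock read rather than N, and the commit-pending test no longer depends on how many attempts a task has. Whether a set or a plain boolean flag is the better fit depends on how status updates arrive, which this sketch does not model.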