Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Reply-To: mapreduce-issues@hadoop.apache.org
Message-ID: <4310699.168021284423095286.JavaMail.jira@thor>
Date: Mon, 13 Sep 2010 20:11:35 -0400 (EDT)
From: "Joydeep Sen Sarma (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Commented: (MAPREDUCE-2062) speculative execution is too aggressive under certain conditions
In-Reply-To: <457646.129791284168093497.JavaMail.jira@thor>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909059#action_12909059 ]

Joydeep Sen Sarma commented on MAPREDUCE-2062:
----------------------------------------------

Another thing we have noticed is that the progress rate (especially the reducer's) is usually quite low compared to the mean when a task first starts, which causes a lot of false speculation. However, the absolute progress rate of the speculated tasks is not bad at all: most of them had a progress rate that would have taken them to 100% within 3-4 minutes.

One heuristic that seemed obvious after looking at this is to put an upper bound on the progress rate, above which speculation does not make sense (regardless of mean/stddev). The proposal is to make this configurable as a 'minimum_duration' setting for mappers/reducers: if the mapper/reducer is projected to finish within this duration, no speculation will be done. Setting the duration to a small number like 3-4 minutes would weed out a lot of excessive speculation.

> speculative execution is too aggressive under certain conditions
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2062
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2062
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>         Environment: hadoop-20 with HADOOP-2141
>            Reporter: Joydeep Sen Sarma
>
> The function canBeSpeculated has subtle bugs that cause too much speculation in certain cases.
> - It compares the current progress of the task with the last observed mean of all the tasks. If only one task is in question, then the progress rate decays as time progresses (in the absence of updates) and the std-dev is zero. So a job with a single reducer or mapper is almost always speculated.
> - If only a single task has reported progress, then the stddev is zero, so other tasks may be speculated aggressively.
> - Several tasks take a while to report progress initially. They seem to get speculated as soon as the speculative lag is over. At a minimum, the lag should be configurable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
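
The 'minimum_duration' proposal in the comment above could be sketched roughly as follows. This is only an illustration, not the actual JobTracker code: the class SpeculationGuard, the field minDurationMillis, and the helper methods are hypothetical names, and the "more than one stddev below the mean" test is an assumed, simplified stand-in for whatever canBeSpeculated really does.

```java
// Hypothetical sketch of the proposed 'minimum_duration' guard.
// None of these names come from the Hadoop code base.
public class SpeculationGuard {

    // Hypothetical setting: a task projected to finish within this many
    // milliseconds is never speculated.
    private final long minDurationMillis;

    public SpeculationGuard(long minDurationMillis) {
        this.minDurationMillis = minDurationMillis;
    }

    /** Progress per millisecond since task start; progress is in [0, 1]. */
    static double progressRate(double progress, long elapsedMillis) {
        return elapsedMillis <= 0 ? 0.0 : progress / elapsedMillis;
    }

    /** Projected milliseconds until 100% at the current progress rate. */
    static long projectedRemainingMillis(double progress, long elapsedMillis) {
        double rate = progressRate(progress, elapsedMillis);
        if (rate <= 0.0) {
            return Long.MAX_VALUE; // no progress reported yet
        }
        return (long) Math.ceil((1.0 - progress) / rate);
    }

    /** Assumed simplified test: rate is more than one stddev below the mean. */
    static boolean isSlowRelativeToPeers(double rate, double meanRate, double stdDevRate) {
        return (meanRate - rate) > stdDevRate;
    }

    public boolean canBeSpeculated(double progress, long elapsedMillis,
                                   double meanRate, double stdDevRate) {
        // Proposed guard: skip speculation when the task should finish soon,
        // regardless of mean/stddev.
        if (projectedRemainingMillis(progress, elapsedMillis) <= minDurationMillis) {
            return false;
        }
        return isSlowRelativeToPeers(progressRate(progress, elapsedMillis),
                                     meanRate, stdDevRate);
    }
}
```

With the threshold set to 4 minutes, a task that is 50% done after 2 minutes would not be speculated in this sketch even when the stddev is zero and the mean rate is much higher, which is the case the issue description complains about.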