From: German Florez-Larrahondo
To: user@hadoop.apache.org
Subject: RE: Fault tolerance and Speculative Execution
Date: Thu, 18 Jul 2013 12:37:38 -0500

Also, a simple explanation of how speculative execution works and what the key settings are can be found here:
http://books.google.com/books?id=Wu_xeGdU4G8C&pg=PA216&dq=hadoop+the+definitive+guide+speculative+execution&hl=en&sa=X&ei=lyLoUd7bDojk9gTC1YDIBA&ved=0CDwQ6AEwAA

In addition, there used to be other parameters (slownodethreshold, slowtaskthreshold & speculativecap), described here:
http://coffee2idea.blogspot.com/2011/11/hadoop-speculative-execution.html
but I believe they were deprecated...
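To make the settings concrete, here is a minimal sketch that generates the two toggles for mapred-site.xml. It assumes the Hadoop 1.x property names mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution. Please check them against the mapred-default page for your release. The helper names (speculative_execution_config, to_xml) are made up for illustration and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Assumed Hadoop 1.x property names; verify against mapred-default for your release.
MAP_SPECULATIVE = "mapred.map.tasks.speculative.execution"
REDUCE_SPECULATIVE = "mapred.reduce.tasks.speculative.execution"


def speculative_execution_config(map_enabled, reduce_enabled):
    """Build a <configuration> element for mapred-site.xml (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, enabled in ((MAP_SPECULATIVE, map_enabled),
                          (REDUCE_SPECULATIVE, reduce_enabled)):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        # Hadoop configuration files conventionally use lowercase true/false.
        ET.SubElement(prop, "value").text = "true" if enabled else "false"
    return root


def to_xml(root):
    """Serialize the element tree to a plain string (hypothetical helper)."""
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Turn speculative execution off for both map and reduce tasks.
    print(to_xml(speculative_execution_config(False, False)))
```

The same two properties can also be set per job instead of cluster-wide, for example through JobConf.setMapSpeculativeExecution(false) and JobConf.setReduceSpeculativeExecution(false) in the job driver.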

Regards
German
./g

-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Thursday, July 18, 2013 12:11 PM
To:
Subject: Re: Fault tolerance and Speculative Execution

What you describe in the first paragraph is not true. The speculative execution API toggles are listed in the documentation:
http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Job+Configuration
and, in property form, on the mapred-default page:
http://hadoop.apache.org/docs/stable/mapred-default.html

Speculative execution is enabled by default.

On Thu, Jul 18, 2013 at 10:32 PM, Sundeep Kambhampati wrote:
> Hi all,
>
> Is it true that Hadoop 'always' starts the same map tasks multiple times
> in order to be fault tolerant, i.e. the same task is launched on several
> machines so that even if a node fails, the same task is still available
> on another node? And if no node fails, the redundant task that finishes
> late is killed. If this is true, how can I change the configuration so
> that Hadoop does or does not do it?
>
> Speculative execution, on the other hand, does what I explained above
> (redundant map tasks), but only after all the map tasks are scheduled:
> if some nodes are free, it starts redundant map tasks for those that
> are running slowly. Is that always true? How do I change this
> configuration to enable/disable it?
>
> I am using Hadoop-1.1.2, in case the version matters.
>
> I would really appreciate it if someone could help me with this. Thank you.
>
> Regards
> Sundeep

--
Harsh J