From: Kai Voigt
To: user@hadoop.apache.org, Sai Sai
Subject: Re: 100K Maps scenario
Date: Fri, 12 Apr 2013 18:48:38 -0700

No, only one copy of each block will be processed.

If a task fails, it will be retried on another copy. Also, if speculative execution is enabled, slow tasks might be executed twice in parallel. But this will only happen rarely.

Kai

On 12.04.2013 at 18:45, Sai Sai <saigraph@yahoo.in> wrote:

> Just a follow up to see if anyone can shed some light on this:
> My understanding is that after each block is replicated 3 times, a map task is run on each of the replicas in parallel.
> What I am trying to verify is this: in a scenario where a file is split into 10K or 100K or more blocks, this would result in at least 300K map tasks being performed, which looks like overkill from a performance or just a logical perspective.
> Will appreciate any thoughts on this.
> Thanks
> Sai
>
> From: Sai Sai <saigraph@yahoo.in>
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>; Sai Sai <saigraph@yahoo.in>
> Sent: Friday, 12 April 2013 1:37 PM
> Subject: Re: Does a Map task run 3 times on 3 TTs or just once
>
> Just wondering if it is right to assume that a map task is run 3 times on 3 different TTs in parallel, and the output of whichever completes first is picked up and written to the intermediate location.
> Or is it true that a map task, even though its data is replicated 3 times, will run only once, with the other 2 on standby, so that if this one fails the second one will run, followed by the third if the second mapper fails?
> Please shed some light.
> Thanks
> Sai

-- 
Kai Voigt
k@123.org
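
To put numbers on the answer above, here is a minimal Python sketch of the arithmetic. It is not Hadoop code: the names `MapCounts` and `map_task_counts` are hypothetical, and it assumes one map task per block (one input split per HDFS block), which is typical but depends on the InputFormat and split settings. The retry and speculative counts are inputs you supply, not values Hadoop reports in this form.

```python
from typing import NamedTuple


class MapCounts(NamedTuple):
    map_tasks: int             # logical map tasks scheduled by the job
    task_attempts: int         # attempts launched, including retries and speculation
    blocks_times_replicas: int  # the mistaken "one task per replica" figure


def map_task_counts(num_blocks, replication=3, retried=0, speculative=0):
    """Compare the real map task count with the blocks * replication guess.

    Assumption: one map task per block. Replication only gives the scheduler
    more nodes to choose from; it does not multiply the number of tasks.
    """
    map_tasks = num_blocks
    # Extra attempts appear only when a task fails and is retried, or when
    # speculative execution launches a second attempt for a slow task.
    task_attempts = map_tasks + retried + speculative
    return MapCounts(map_tasks, task_attempts, num_blocks * replication)


if __name__ == "__main__":
    counts = map_task_counts(100_000, retried=50, speculative=20)
    print(counts)
```

With 100K blocks and a replication factor of 3, the sketch gives 100K map tasks rather than 300K; the attempt count exceeds the task count only by the number of retries and speculative attempts.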