Message-ID: <1349201855.38092.YahooMailNeo@web125305.mail.ne1.yahoo.com>
Date: Tue, 2 Oct 2012 11:17:35 -0700 (PDT)
From: Shing Hing Man
Subject: Re: How to lower the total number of map tasks
To: "user@hadoop.apache.org", "bejoy.hadoop@gmail.com"
In-Reply-To: <1600482676-1349199982-cardhu_decombobulator_blackberry.rim.net-614239968-@b3.c16.bise7.blackberry>
References: <1349195676.38555.YahooMailNeo@web125301.mail.ne1.yahoo.com> <1349199539.93871.YahooMailNeo@web125304.mail.ne1.yahoo.com> <1600482676-1349199982-cardhu_decombobulator_blackberry.rim.net-614239968-@b3.c16.bise7.blackberry>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="-1179957465-263031986-1349201855=:38092"

---1179957465-263031986-1349201855=:38092
Content-Type: text/html; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

I only have one big input file.

Shing

From: Bejoy KS <bejoy.hadoop@gmail.com>
To: user@hadoop.apache.org; Shing Hing Man <matmsh@yahoo.com>
Sent: Tuesday, October 2, 2012 6:46 PM
Subject: Re: How to lower the total number of map tasks

Hi Shing

Is your input a single file or a set of small files? If the latter, you need to use CombineFileInputFormat.

Regards
Bejoy KS

Sent from handheld, please excuse typos.

From: Shing Hing Man <matmsh@yahoo.com>
Date: Tue, 2 Oct 2012 10:38:59 -0700 (PDT)
To: user@hadoop.apache.org <user@hadoop.apache.org>
ReplyTo: user@hadoop.apache.org
Subject: Re: How to lower the total number of map tasks


I have tried
      Configuration.setInt("mapred.max.split.size", 134217728);

and setting mapred.max.split.size in mapred-site.xml (dfs.block.size is left unchanged at 67108864).

But in the job.xml, I am still getting mapred.map.tasks = 242.
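
One possible explanation for the unchanged count, offered as a sketch rather than a diagnosis: the new-API FileInputFormat (org.apache.hadoop.mapreduce.lib.input) sizes splits as max(minSize, min(maxSize, blockSize)). Assuming the job follows that rule, a mapred.max.split.size above the block size changes nothing, because the block size is already the smaller of the two. The class and method below are illustrative stand-ins and do not call Hadoop itself.

```java
final class SplitSizeSketch {
    // Assumed rule: the block size, capped by the max split size,
    // then raised to at least the min split size.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 67108864L;    // dfs.block.size quoted in the thread (64 MB)
        long triedMax = 134217728L;    // mapred.max.split.size that was tried (128 MB)

        // Max raised, min left at 0: the split size stays at the block size.
        System.out.println(computeSplitSize(blockSize, 0L, triedMax));

        // Hypothetical alternative: min raised to 128 MB, max left unbounded.
        System.out.println(computeSplitSize(blockSize, 134217728L, Long.MAX_VALUE));
    }
}
```

Under this assumption the first call yields 67108864 and the second 134217728, so mapred.min.split.size (0 in the job.xml quoted in this thread) would be the setting to experiment with.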

Shing



From: Bejoy Ks <bejoy.hadoop@gmail.com>
To: user@hadoop.apache.org; Shing Hing Man <matmsh@yahoo.com>
Sent: Tuesday, October 2, 2012 6:03 PM
Subject: Re: How to lower the total number of map tasks

Sorry for the typo, the property name is mapred.max.split.size

Also, just for changing the number of map tasks you don't need to modify the HDFS block size.

On Tue, Oct 2, 2012 at 10:31 PM, Bejoy Ks <bejoy.hadoop@gmail.com> wrote:
Hi

You need to alter the value of mapred.max.split size to a value larger than your block size to have less number of map tasks than the default.
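
To aim for a particular number of map tasks, the split size needed can be estimated from the input size. A rough sketch: the input size is assumed to be 242 full 64 MB blocks (inferred from the mapred.map.tasks value reported in this thread), the helper name is hypothetical, and the real count can differ slightly because an input format may fold a small trailing remainder into the last split. Which property actually moves the split size depends on the input format in use.

```java
final class TargetSplitSize {
    // Smallest split size in bytes that gives at most targetMaps splits
    // (ceiling division, so no bytes are left over for an extra split).
    static long splitSizeFor(long totalBytes, long targetMaps) {
        if (totalBytes < 0 || targetMaps <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and targetMaps > 0");
        }
        return (totalBytes + targetMaps - 1) / targetMaps;
    }

    public static void main(String[] args) {
        long totalBytes = 242L * 67108864L; // assumption: 242 full 64 MB blocks
        long targetMaps = 121L;             // hypothetical goal: half the current map count
        System.out.println(splitSizeFor(totalBytes, targetMaps)); // 134217728 (128 MB)
    }
}
```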


On Tue, Oct 2, 2012 at 10:04 PM, Shing Hing Man <matmsh@yahoo.com> wrote:

I am running Hadoop 1.0.3 in pseudo-distributed mode.
When I submit a map/reduce job to process a file of size about 16 GB, in job.xml, I have the following

mapred.map.tasks = 242
mapred.min.split.size = 0
dfs.block.size = 67108864

I would like to reduce mapred.map.tasks to see if it improves performance.
I have tried doubling the size of dfs.block.size. But the mapred.map.tasks remains unchanged.
Is there a way to reduce mapred.map.tasks?

Thanks in advance for any assistance!
Shing
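
The figures above are consistent with roughly one map task per 64 MB block. A quick arithmetic sketch, with a hypothetical helper name; it assumes one split per full split-size chunk plus one for any remainder, which ignores the small slack an input format may allow on the last split.

```java
final class MapCountEstimate {
    // Ceiling division: number of splits needed to cover fileBytes.
    static long estimateSplits(long fileBytes, long splitBytes) {
        if (fileBytes < 0 || splitBytes <= 0) {
            throw new IllegalArgumentException("fileBytes must be >= 0 and splitBytes > 0");
        }
        return (fileBytes + splitBytes - 1) / splitBytes;
    }

    public static void main(String[] args) {
        long blockSize = 67108864L;                // dfs.block.size from the thread
        long sixteenGiB = 16L * 1024 * 1024 * 1024;

        // An exact 16 GiB file at 64 MB per split.
        System.out.println(estimateSplits(sixteenGiB, blockSize)); // 256

        // 242 splits of 64 MB correspond to about 15.1 GiB, i.e. "about 16 GB".
        System.out.println(242L * blockSize / (1024.0 * 1024 * 1024));

        // The same input at a 128 MB split size.
        System.out.println(estimateSplits(242L * blockSize, 2 * blockSize)); // 121
    }
}
```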


---1179957465-263031986-1349201855=:38092--