Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of bejoy.hadoop@gmail.com
 designates 209.85.223.176 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CACD21EMULgxKq-UcBK1iWWJ68D2wi9u=mBWqGOeBcid5=MJn=A@mail.gmail.com>
References: <1349195676.38555.YahooMailNeo@web125301.mail.ne1.yahoo.com>
	<CACD21EMULgxKq-UcBK1iWWJ68D2wi9u=mBWqGOeBcid5=MJn=A@mail.gmail.com>
Date: Tue, 2 Oct 2012 22:33:48 +0530
Message-ID: 
 <CACD21EMSjL5LTiYpa--hLKx7uMCV_M7Kt7iVrFH61tHqOtuWzw@mail.gmail.com>
Subject: Re: How to lower the total number of map tasks
From: Bejoy Ks <bejoy.hadoop@gmail.com>
To: user@hadoop.apache.org, Shing Hing Man <matmsh@yahoo.com>
Content-Type: multipart/alternative; boundary=bcaec5186fea78f44304cb1681fc

--bcaec5186fea78f44304cb1681fc
Content-Type: text/plain; charset=ISO-8859-1

Sorry for the typo; the property name is mapred.max.split.size.

Also, just to change the number of map tasks, you don't need to modify the
HDFS block size.
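
For reference, here is a rough sketch of how I understand the split size to
be picked. It assumes the new-API FileInputFormat formula
max(minSize, min(maxSize, blockSize)) and the 1.1 slop factor used when
carving a file into splits. The helper names, the 242-block file size and
the 256 MB values are only illustrative and are not taken from your job.

```python
MB = 1024 * 1024
LONG_MAX = 2**63 - 1  # default upper bound when no max split size is set


def compute_split_size(block_size, min_size, max_size):
    # Assumed formula: max(minSize, min(maxSize, blockSize))
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, split_size, slop=1.1):
    # Carve off full splits while more than `slop` splits' worth remains,
    # then put whatever is left into one final split.
    remaining = file_size
    splits = 0
    while remaining / split_size > slop:
        remaining -= split_size
        splits += 1
    if remaining != 0:
        splits += 1
    return splits


block = 64 * MB            # dfs.block.size = 67108864
file_size = 242 * block    # assumption: a file of exactly 242 blocks

# Defaults: the split size equals the block size.
default_split = compute_split_size(block, 1, LONG_MAX)
print(count_splits(file_size, default_split))   # 242

# Raising only the maximum above the block size changes nothing here.
max_only = compute_split_size(block, 1, 256 * MB)
print(count_splits(file_size, max_only))        # 242

# Raising the minimum above the block size gives larger, fewer splits.
min_raised = compute_split_size(block, 256 * MB, LONG_MAX)
print(count_splits(file_size, min_raised))      # 61
```

If that formula applies to your input format, the split only grows past the
block size once mapred.min.split.size exceeds it; a larger maximum on its
own leaves the split at the block size. As far as I know, the old mapred API
uses a goal size (total size / requested maps) in place of maxSize, so the
same reasoning would hold there.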

On Tue, Oct 2, 2012 at 10:31 PM, Bejoy Ks <bejoy.hadoop@gmail.com> wrote:

> Hi
>
> You need to set mapred.max.split size to a value larger than your block
> size to get fewer map tasks than the default.
>
>
> On Tue, Oct 2, 2012 at 10:04 PM, Shing Hing Man <matmsh@yahoo.com> wrote:
>
>>
>> I am running Hadoop 1.0.3 in pseudo-distributed mode.
>> When I submit a map/reduce job to process a file of about 16 GB,
>> job.xml contains the following:
>>
>> mapred.map.tasks = 242
>> mapred.min.split.size = 0
>> dfs.block.size = 67108864
>>
>> I would like to reduce mapred.map.tasks to see if it improves
>> performance.
>> I have tried doubling dfs.block.size, but mapred.map.tasks remains
>> unchanged.
>> Is there a way to reduce mapred.map.tasks?
>>
>> Thanks in advance for any assistance!
>> Shing
>>
>


--bcaec5186fea78f44304cb1681fc--