hadoop-user mailing list archives

From "Bejoy KS" <bejoy.had...@gmail.com>
Subject Re: How to lower the total number of map tasks
Date Tue, 02 Oct 2012 17:46:07 GMT
Hi Shing

Is your input a single file or a set of small files? If the latter, you need to use CombineFileInputFormat.

Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: Shing Hing Man <matmsh@yahoo.com>
Date: Tue, 2 Oct 2012 10:38:59 
To: user@hadoop.apache.org<user@hadoop.apache.org>
Reply-To: user@hadoop.apache.org
Subject: Re: How to lower the total number of map tasks

I have tried 

and setting mapred.max.split.size in mapred-site.xml. ( dfs.block.size is left unchanged at

But in job.xml, I am still getting mapred.map.tasks = 242.
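
Editorial note: one possible explanation for the unchanged count, offered as a sketch and not as a confirmed diagnosis. In the newer org.apache.hadoop.mapreduce API, FileInputFormat computes the split size as max(minSize, min(maxSize, blockSize)); the older org.apache.hadoop.mapred API uses the same shape with a goal size in place of maxSize. Under that formula, a maximum split size larger than the block size leaves the split size at the block size, and only a minimum split size above the block size enlarges the splits. The Python below models this arithmetic. The helper names, the 1.1 slop factor, and the assumption that the file is exactly 242 blocks of 67108864 bytes are illustrative, not taken from the thread.

```python
# Illustrative model of FileInputFormat split arithmetic (not Hadoop code).
MB = 1024 * 1024
LONG_MAX = 2**63 - 1   # assumed default for an unset maximum split size
SPLIT_SLOP = 1.1       # assumed tolerance before the last split is cut

def compute_split_size(block_size, min_size, max_size):
    """Split size as max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))

def count_splits(file_size, split_size):
    """Count splits for one file, letting the last split run up to 10% over."""
    splits = 0
    remaining = file_size
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1
    return splits

block = 67108864          # dfs.block.size reported in job.xml
file_size = 242 * block   # assumption: the file is exactly 242 blocks

# Default settings: minimum 0, maximum unset.
default_size = compute_split_size(block, 0, LONG_MAX)
# Maximum raised above the block size: split size is still the block size.
raised_max = compute_split_size(block, 0, 128 * MB)
# Minimum raised above the block size: split size grows.
raised_min = compute_split_size(block, 128 * MB, LONG_MAX)

print(count_splits(file_size, default_size))
print(count_splits(file_size, raised_max))
print(count_splits(file_size, raised_min))
```

In this model the first two cases give the same number of splits, while raising the minimum (mapred.min.split.size) to 128 MB halves it. Whether the real job behaves this way depends on which InputFormat and API it uses.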


 From: Bejoy Ks <bejoy.hadoop@gmail.com>
To: user@hadoop.apache.org; Shing Hing Man <matmsh@yahoo.com> 
Sent: Tuesday, October 2, 2012 6:03 PM
Subject: Re: How to lower the total number of map tasks

Sorry for the typo, the property name is mapred.max.split.size

Also, just to change the number of map tasks, you don't need to modify the HDFS block size.

On Tue, Oct 2, 2012 at 10:31 PM, Bejoy Ks <bejoy.hadoop@gmail.com> wrote:

>You need to alter the value of mapred.max.split size to a value larger than your block size to have fewer map tasks than the default.
>On Tue, Oct 2, 2012 at 10:04 PM, Shing Hing Man <matmsh@yahoo.com> wrote:
>>I am running Hadoop 1.0.3 in pseudo-distributed mode.
>>When I submit a map/reduce job to process a file of about 16 GB, job.xml shows the following:
>>mapred.map.tasks = 242
>>mapred.min.split.size = 0
>>dfs.block.size = 67108864
>>I would like to reduce mapred.map.tasks to see if it improves performance.
>>I have tried doubling dfs.block.size, but mapred.map.tasks remains unchanged.
>>Is there a way to reduce mapred.map.tasks?
>>Thanks in advance for any assistance!