hadoop-user mailing list archives

From Shing Hing Man <mat...@yahoo.com>
Subject Re: How to lower the total number of map tasks
Date Tue, 02 Oct 2012 17:33:45 GMT

 I set the block size using 

I have also set it in mapred-site.xml.


 From: Chris Nauroth <cnauroth@hortonworks.com>
To: user@hadoop.apache.org; Shing Hing Man <matmsh@yahoo.com> 
Sent: Tuesday, October 2, 2012 6:00 PM
Subject: Re: How to lower the total number of map tasks

Those numbers make sense, considering 1 map task per block: a file of about 16 GB divided
by a 64 MB block size gives ~242 map tasks.
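
The arithmetic above can be checked with a rough sketch. It assumes the split-size rule used by the old-API FileInputFormat in Hadoop 1.x, max(minSize, min(goalSize, blockSize)), and ignores the small slop allowance on the last split. The function name and the exact file size are hypothetical: 242 blocks of 64 MB is about 15.1 GiB, whereas exactly 16 GiB would give 256 blocks.

```python
import math

MB = 1024 * 1024


def estimate_map_tasks(file_size, block_size, min_split_size=0, requested_maps=1):
    """Estimate the number of input splits (map tasks) for a single file.

    Hypothetical helper, not part of Hadoop. It mirrors the rule
    splitSize = max(minSize, min(goalSize, blockSize)), where goalSize is the
    file size divided by the requested number of maps.
    """
    goal_size = file_size // max(requested_maps, 1)
    split_size = max(max(min_split_size, 1), min(goal_size, block_size))
    return math.ceil(file_size / split_size)


# Assumed file size: exactly 242 blocks of 64 MB (about 15.1 GiB).
file_size = 242 * 64 * MB

# One map task per 64 MB block, as reported in job.xml.
print(estimate_map_tasks(file_size, 64 * MB))

# Same file and block size, but mapred.min.split.size raised to 128 MB.
print(estimate_map_tasks(file_size, 64 * MB, min_split_size=128 * MB))
```

Under these assumptions, raising mapred.min.split.size above the block size of the stored file enlarges the splits, and so lowers the map count, without rewriting the file.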

When you doubled dfs.block.size, how did you accomplish that?  Typically, the block size
is selected at file write time, with a default value from system configuration used if not
specified.  Did you "hadoop fs -put" the file with the new block size, or was it something

Thank you,

On Tue, Oct 2, 2012 at 9:34 AM, Shing Hing Man <matmsh@yahoo.com> wrote:

>I am running Hadoop 1.0.3 in pseudo-distributed mode.
>When I submit a map/reduce job to process a file of about 16 GB, job.xml contains the following:
>mapred.map.tasks = 242
>mapred.min.split.size = 0
>dfs.block.size = 67108864
>I would like to reduce mapred.map.tasks to see if it improves performance.
>I have tried doubling dfs.block.size, but mapred.map.tasks remains unchanged.
>Is there a way to reduce mapred.map.tasks?
>Thanks in advance for any assistance!