hadoop-user mailing list archives

From "Bejoy KS" <bejoy.had...@gmail.com>
Subject Re: How to lower the total number of map tasks
Date Tue, 02 Oct 2012 17:37:36 GMT

This doesn't change the block size of existing files in HDFS; only new files written to HDFS will be affected. To apply it to existing files you need to re-copy them, at minimum with
hadoop fs -cp src destn.
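A minimal sketch of that re-copy, assuming a 128 MB target block size and hypothetical paths /data/input and /data/input_128m; in Hadoop 1.x the client-side -D override sets the block size for the files written by the copy:

```shell
# Hypothetical paths and block size; dfs.block.size is the Hadoop 1.x key.
NEW_BLOCK_SIZE=$((128 * 1024 * 1024))   # 134217728 bytes

# Guarded so the sketch is a no-op where the hadoop CLI is unavailable.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -D dfs.block.size="$NEW_BLOCK_SIZE" -cp /data/input /data/input_128m
fi
```

The original files keep their old 64 MB blocks; only the copies pick up the new size.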

Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: Shing Hing Man <matmsh@yahoo.com>
Date: Tue, 2 Oct 2012 10:33:45 
To: user@hadoop.apache.org
Reply-To: user@hadoop.apache.org
Subject: Re: How to lower the total number of map tasks

 I set the block size using 

I have also set it in mapred-site.xml.
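For reference, in Hadoop 1.x dfs.block.size is an HDFS write-time setting, so it conventionally lives in hdfs-site.xml rather than mapred-site.xml; a minimal fragment (128 MB chosen only as an example value):

```xml
<!-- hdfs-site.xml: default block size for files written by this client -->
<property>
  <name>dfs.block.size</name>
  <value>134217728</value> <!-- 128 MB; applies only to newly written files -->
</property>
```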


 From: Chris Nauroth <cnauroth@hortonworks.com>
To: user@hadoop.apache.org; Shing Hing Man <matmsh@yahoo.com> 
Sent: Tuesday, October 2, 2012 6:00 PM
Subject: Re: How to lower the total number of map tasks

Those numbers make sense, considering 1 map task per block: a roughly 16 GB file / 64 MB block size gives on the order of 250 blocks, consistent with the 242 map tasks observed (242 x 64 MB is about 15.1 GB).

When you doubled dfs.block.size, how did you accomplish that?  Typically, the block size is selected at file write time, with a default value from system configuration used if not specified.  Did you "hadoop fs -put" the file with the new block size, or was it something else?
Thank you,

On Tue, Oct 2, 2012 at 9:34 AM, Shing Hing Man <matmsh@yahoo.com> wrote:

> I am running Hadoop 1.0.3 in pseudo-distributed mode.
> When I submit a map/reduce job to process a file of size about 16 GB, in job.xml I have the following:
> mapred.map.tasks = 242
> mapred.min.split.size = 0
> dfs.block.size = 67108864
> I would like to reduce mapred.map.tasks to see if it improves performance.
> I have tried doubling the size of dfs.block.size, but mapred.map.tasks remains unchanged.
> Is there a way to reduce mapred.map.tasks?
> Thanks in advance for any assistance!
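The behaviour in the question can be sketched with a simplified model of how Hadoop 1.x FileInputFormat sizes splits (old API: splitSize = max(minSize, min(goalSize, blockSize))); this ignores the per-file remainder handling and is illustrative only, but it shows why raising mapred.min.split.size reduces the map count while re-setting dfs.block.size alone does not:

```python
# Simplified model of Hadoop 1.x FileInputFormat split sizing (old API).
def split_size(block_size, min_split_size, goal_size):
    # old-API formula: max(minSize, min(goalSize, blockSize))
    return max(min_split_size, min(goal_size, block_size))

def num_map_tasks(file_size, block_size, min_split_size, requested_maps=1):
    goal = file_size // max(1, requested_maps)
    size = split_size(block_size, min_split_size, goal)
    return -(-file_size // size)  # ceiling division

MB = 1 << 20
file_size = 242 * 64 * MB  # ~15.1 GB, matching the 242 maps observed

# Current settings: 64 MB blocks, min split 0 -> one map per block.
print(num_map_tasks(file_size, 64 * MB, 0))         # 242

# Raising mapred.min.split.size to 128 MB halves the map count,
# even though the file's blocks are still 64 MB.
print(num_map_tasks(file_size, 64 * MB, 128 * MB))  # 121
```

Since the existing file keeps its 64 MB blocks regardless of a new dfs.block.size, the split count only moves if the splits themselves are made larger (via mapred.min.split.size) or the file is rewritten with bigger blocks.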