Those numbers make sense, given one map task per block: a ~16 GB file / 64 MB block size ≈ 242 map tasks.
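As a sanity check on that arithmetic: with the default mapred.min.split.size and mapred.max.split.size, FileInputFormat's split size works out to dfs.block.size, so the map-task count is roughly ceil(fileSize / blockSize). A quick sketch (a full 16 GiB would give exactly 256 splits; the observed 242 suggests the file is closer to 242 × 64 MB ≈ 15.1 GB):

```shell
# Split-count estimate, assuming default FileInputFormat behaviour:
# splitSize = max(minSplitSize, min(maxSplitSize, blockSize)) = blockSize here.
BLOCK=$((64 * 1024 * 1024))               # dfs.block.size = 67108864
FILE=$((16 * 1024 * 1024 * 1024))         # exactly 16 GiB; the real file is only "about" this
SPLITS=$(( (FILE + BLOCK - 1) / BLOCK ))  # one map task per split
echo "$SPLITS"                            # prints 256 for a full 16 GiB
```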

When you doubled dfs.block.size, how did you accomplish that?  Typically, the block size is selected at file write time, with a default value from system configuration used if not specified.  Did you "hadoop fs -put" the file with the new block size, or was it something else?
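For reference, one way to rewrite the file with a larger block size is to pass the property as a generic option at write time (the paths and the 128 MB value below are placeholders; dfs.block.size only affects files written after the change, not existing ones):

```shell
# Hypothetical paths; -D sets dfs.block.size (the Hadoop 1.x property name)
# for this write only. Existing files keep their original block size.
hadoop fs -D dfs.block.size=134217728 -put /local/path/bigfile /user/hadoop/bigfile

# Confirm the block size the file actually got:
hadoop fsck /user/hadoop/bigfile -files -blocks
```

(These commands need a running HDFS, so they are shown for illustration rather than as a tested recipe.)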

Thank you,

On Tue, Oct 2, 2012 at 9:34 AM, Shing Hing Man <> wrote:

I am running Hadoop 1.0.3 in pseudo-distributed mode.
When I submit a map/reduce job to process a file of about 16 GB, job.xml contains the following:
 = 242
mapred.min.split.size = 0
dfs.block.size = 67108864

I would like to reduce  to see if it improves performance.
I have tried doubling dfs.block.size, but the  remains unchanged.
Is there a way to reduce  ?

Thanks in advance for any assistance!