hadoop-general mailing list archives

From Allen Wittenauer <awittena...@linkedin.com>
Subject Re: Data Block Size ?
Date Thu, 15 Jul 2010 18:49:04 GMT

On Jul 15, 2010, at 11:40 AM, Syed Wasti wrote:

> Will it matter what the data block size is?

Yes.

> It is recommended to have a block size of 64 MB, but if we want to set the data block
> size to 128 MB, should this affect the performance?

Yes.

FWIW, we run with 128MB.

> Does the size of the map jobs created on each datanode in any way depend on the block
> size?

Yes.

Unless told otherwise, Hadoop will generally run one map per block (# of maps == # of
blocks).  So with a larger block size you have fewer blocks to process, and therefore fewer
maps that each do more work.  This is not necessarily a bad thing; it all depends upon your
workload, size of grid, etc.
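
To make the maps == blocks rule of thumb concrete, here is a rough back-of-the-envelope sketch. It assumes a single input file and exactly one map per block; the function name estimated_map_count is made up for illustration and is not a Hadoop API. The real number of maps depends on the input format, the number of files, and any split settings on the job.

```python
MB = 1024 * 1024


def estimated_map_count(file_size_bytes, block_size_bytes):
    """Approximate the number of maps for one file.

    Assumption (not a Hadoop API): one map per block, so the estimate is
    simply the number of blocks the file occupies, rounded up.
    """
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division: a partially filled last block still counts.
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


if __name__ == "__main__":
    ten_gb = 10 * 1024 * MB
    for block_mb in (64, 128):
        maps = estimated_map_count(ten_gb, block_mb * MB)
        print(f"{block_mb} MB blocks: about {maps} maps")
```

Under those assumptions, doubling the block size halves the number of maps for the same input, and each map reads twice as much data.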

