hadoop-common-user mailing list archives

From Jim Falgout <jim.falg...@pervasive.com>
Subject RE: HDFS block size v.s. mapred.min.split.size
Date Thu, 17 Feb 2011 22:32:03 GMT
Generally, if you have large files, setting the block size to 128MB or larger is helpful. You
can do that on a per-file basis or set the block size for the whole filesystem. The larger
block size cuts down on the number of map tasks required to handle the overall data size.
I've experimented with mapred.min.split.size as well and have usually found that the larger
the split size, the better the overall run time. Of course there is a cutoff point, especially
on a very large cluster, where larger split sizes will hurt overall scalability.
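
If it helps, here is a rough sketch of the per-file approach using the FileSystem API
(untested; the path, buffer size and replication below are just placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LargeBlockWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            // create(path, overwrite, bufferSize, replication, blockSize):
            // write this one file with a 128MB block size, independent of
            // the filesystem-wide dfs.block.size default.
            FSDataOutputStream out = fs.create(new Path("/data/input.dat"),
                    true, 4096, (short) 3, 128L * 1024 * 1024);
            out.close();
        }
    }

Setting dfs.block.size in hdfs-site.xml changes the default for the whole filesystem instead,
but note it only applies to files written after the change.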

On tests I've run on 10- and 20-node clusters, though, setting the split size as high as 1GB
has allowed the overall Hadoop jobs to run faster, sometimes quite a bit faster. You lose
some data locality, but it seems to be a trade-off against the number of files that have to
be shuffled for the reduce step.
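
If you want to try the same thing, a minimal sketch with the old mapred API (the input path
is a placeholder):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.JobConf;

    public class BigSplitJob {
        public static JobConf configure() {
            JobConf job = new JobConf(BigSplitJob.class);
            // Don't let FileInputFormat cut splits smaller than 1GB,
            // so each map task reads roughly 1GB of input.
            job.setLong("mapred.min.split.size", 1024L * 1024 * 1024);
            FileInputFormat.setInputPaths(job, new Path("/data/input"));
            return job;
        }
    }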

-----Original Message-----
From: Boduo Li [mailto:birdeeyore@gmail.com] 
Sent: Thursday, February 17, 2011 12:01 PM
To: common-user@hadoop.apache.org
Subject: HDFS block size v.s. mapred.min.split.size

Hi,

I've recently been benchmarking Hadoop. I know of two ways to control the input data size
for each map task: by changing the HDFS block size (which requires reloading the data into
HDFS), or by setting mapred.min.split.size.

For my benchmarking, I need to change the input size for a map task frequently. Changing the
HDFS block size and reloading the data is really painful.
But using mapred.min.split.size seems to be problematic. I ran some simple tests to verify
whether Hadoop has similar performance in the following cases:

(1) HDFS block size = 32MB, mapred.min.split.size = 64MB (mapred.min.split.size only takes
effect when set larger than the HDFS block size; see the sketch after case (2))

(2) HDFS block size = 64MB, mapred.min.split.size is not set
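
For context on why mapred.min.split.size only matters above the block size: the old mapred
FileInputFormat computes each split's size as max(minSize, min(goalSize, blockSize)), where
minSize comes from mapred.min.split.size (default 1) and goalSize is the total input size
divided by the requested number of maps. A small demo of that arithmetic for the two cases
above (the 1GB goalSize is just a placeholder):

    public class SplitSizeDemo {
        // Mirrors the split sizing in org.apache.hadoop.mapred.FileInputFormat.
        // Because blockSize caps the inner min(), minSize only changes the
        // result once it exceeds the block size.
        static long computeSplitSize(long goalSize, long minSize, long blockSize) {
            return Math.max(minSize, Math.min(goalSize, blockSize));
        }

        public static void main(String[] args) {
            long mb = 1024L * 1024;
            // Case (1): 32MB blocks, 64MB min split -> prints 64 (MB per split).
            System.out.println(computeSplitSize(1024 * mb, 64 * mb, 32 * mb) / mb);
            // Case (2): 64MB blocks, min split unset (default 1) -> prints 64.
            System.out.println(computeSplitSize(1024 * mb, 1, 64 * mb) / mb);
        }
    }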

I ran the same job under these settings. Setting (1) takes 1374s to finish.
Setting (2) takes 1412s to finish.

I do understand that, with a smaller HDFS block size, the I/O is more random.
But the 38-second difference seems too large to attribute to random I/O on the input data.
Does anyone have any insight into this? Or does anyone know a better way to control the input
size of each map task?

Thanks.

