hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: Poor IO performance on a 10 node cluster.
Date Mon, 30 May 2011 15:19:24 GMT

On May 30, 2011, at 7:27 AM, Gyuribácsi wrote:

> Hi,
> I have a 10 node cluster (IBM blade servers, 48GB RAM, 2x500GB Disk, 16 HT
> cores).
> I've uploaded 10 files to HDFS. Each file is 10GB. I used the streaming jar
> with 'wc -l' as mapper and 'cat' as reducer.
> I use 64MB block size and the default replication (3).
> The wc on the 100 GB took about 220 seconds which translates to about 3.5
> Gbit/sec processing speed. One disk can do sequential read with 1Gbit/sec so
> I would expect something around 20 Gbit/sec (minus some overhead), and I'm
> getting only 3.5.
> Is my expectation valid?

Probably not, at least not out-of-the-box.  Things to tune:
1) Number of threads per disk
2) Block size (yours seems relatively small)
3) Network bottlenecks (can be seen using Ganglia)
4) Related to (3), the number of replicas.
5) Selection of the Linux I/O scheduler.  Default CFQ scheduler is inappropriate for batch
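
For item 5, a minimal sketch of how to see which I/O scheduler each disk is currently using. It assumes a Linux host that exposes /sys/block/<device>/queue/scheduler, where the active scheduler is shown in square brackets (for example "noop deadline [cfq]"). The function names are illustrative only and are not part of Hadoop.

```python
import glob
import os
import re


def parse_active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def active_schedulers(pattern="/sys/block/*/queue/scheduler"):
    """Map each block device name to its active scheduler (assumes Linux sysfs)."""
    result = {}
    for path in glob.glob(pattern):
        # .../<device>/queue/scheduler -> <device>
        device = os.path.basename(os.path.dirname(os.path.dirname(path)))
        try:
            with open(path) as handle:
                result[device] = parse_active_scheduler(handle.read())
        except OSError:
            # Unreadable entry: record the device but leave the value unknown.
            result[device] = None
    return result


if __name__ == "__main__":
    for device, scheduler in sorted(active_schedulers().items()):
        print(device, scheduler)
```

On a host without that sysfs layout the function simply returns an empty dict. Run it on every datanode, since the setting is per disk and per host.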

Finally, if you don't have enough host-level monitoring to indicate the current bottleneck
(CPU, memory, network, or I/O?), you likely won't ever be able to solve this riddle
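
For reference, the arithmetic in the quoted question can be checked with a short sketch. It uses only the figures given there (100 GB in about 220 seconds, 10 nodes, 2 disks per node, 1 Gbit/sec per disk, 64MB blocks). Treating GB as 10^9 bytes for the throughput and using binary units for the block count are assumptions, and the function names are made up for illustration.

```python
def throughput_gbit_per_s(total_bytes, seconds):
    """Observed throughput in Gbit/sec for a job that read total_bytes."""
    return total_bytes * 8 / seconds / 1e9


def raw_disk_ceiling_gbit_per_s(nodes, disks_per_node, per_disk_gbit):
    """Upper bound if every disk streamed sequentially at full speed at once."""
    return nodes * disks_per_node * per_disk_gbit


def block_count(file_bytes, block_bytes):
    """Number of HDFS blocks needed for one file (ceiling division)."""
    return -(-file_bytes // block_bytes)


if __name__ == "__main__":
    observed = throughput_gbit_per_s(100e9, 220)
    ceiling = raw_disk_ceiling_gbit_per_s(10, 2, 1.0)
    blocks = 10 * block_count(10 * 2**30, 64 * 2**20)
    print(f"observed: {observed:.2f} Gbit/sec")
    print(f"raw disk ceiling: {ceiling:.0f} Gbit/sec")
    print(f"blocks across the 10 files: {blocks}")
```

The ceiling figure ignores everything in the tuning list above, so it is a best case for the disks alone rather than a prediction of what the job should achieve.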
