hadoop-common-user mailing list archives

From "M. C. Srivas" <mcsri...@gmail.com>
Subject Re: the performance of HDFS
Date Tue, 25 Jan 2011 22:49:42 GMT
On Tue, Jan 25, 2011 at 12:33 PM, Da Zheng <zhengda1936@gmail.com> wrote:

> Hello,
> I'm trying to measure the performance of HDFS, but the write rate is quite
> low. When the replication factor is 1, the rate of writing to HDFS is about
> 60MB/s. When the replication factor is 3, the rate drops significantly to
> about 15MB/s. Even though the actual rate of writing data to the disk is
> about 45MB/s, it's still much lower than when the replication factor is 1.
> The link between two nodes in the cluster is 1Gbps. The CPU is a Dual-Core AMD
> Opteron(tm) Processor 2212, so the CPU isn't the bottleneck either. I thought I
> should be able to saturate the disk very easily. I wonder where the
> bottleneck is. What is the throughput for writing on a Hadoop cluster when
> the replication factor is 3?

The numbers above match what I have observed. If your data is 3-way
replicated, the data-nodes write about 3x the data your client actually
writes. Consequently, your write rate is limited to 1/3 of how fast the disk
can write, minus some overhead for replication.
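
To make the arithmetic concrete, here is a rough sketch in Python. The
function name client_write_rate is made up for illustration, and the model
assumes every replica competes for the same disk bandwidth (as on a very
small cluster) and ignores replication overhead, so treat the result as an
upper bound rather than a prediction.

```python
def client_write_rate(disk_rate_mb_s: float, replication: int) -> float:
    """Upper bound on the client-visible write rate, in MB/s.

    Assumes each byte the client writes lands on disk `replication`
    times and that all copies share the same disk bandwidth.
    """
    if replication < 1:
        raise ValueError("replication must be >= 1")
    return disk_rate_mb_s / replication


# Figures from the quoted message: about 45 MB/s reaching the disk
# at replication factor 3.
print(client_write_rate(45.0, 3))  # 15.0
```

With the 45MB/s disk rate and replication factor 3 from the original
message, this gives the 15MB/s that was measured.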

The aggregate write rate can get much higher if you use more drives, but
single-stream throughput is limited to the speed of one disk spindle.

> Thanks,
> Da
