hadoop-hdfs-user mailing list archives

From elton sky <eltonsky9...@gmail.com>
Subject Re: increase BytesPerChecksum decrease write performance??
Date Mon, 11 Oct 2010 23:50:30 GMT
Hi Hairong,

I am using 0.20.2. I set "dfs.write.packet.size" to 512B, 32KB, 64KB,
256KB, 512KB, 2MB, and 8MB, keeping bytesPerChecksum at its default of
512B, and I got similar results as before.

I think the problem is the packet size, which is the size of the buffer
used for each write/read on the pipeline. Any ideas?

BTW: if "dfs.write.packet.size" in 0.20.2 equals "
dfs.client-write-packet-size " in 0.21
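
For reference, this is roughly the harness I am timing with. It is a
minimal sketch assuming 0.20.2 property names; the /bench path, the 64KB
client write buffer, and the class name are my own choices:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PacketSizeBench {
      public static void main(String[] args) throws Exception {
        int packetSize = Integer.parseInt(args[0]);         // e.g. 65536
        Configuration conf = new Configuration();
        conf.setInt("dfs.write.packet.size", packetSize);   // 0.20.2 property name
        conf.setLong("dfs.block.size", 128L * 1024 * 1024); // 128MB blocks
        FileSystem fs = FileSystem.get(conf);

        byte[] buf = new byte[64 * 1024];                   // client-side write buffer
        long total = 1024L * 1024 * 1024;                   // 1GB file
        long start = System.currentTimeMillis();
        FSDataOutputStream out = fs.create(new Path("/bench/f-" + packetSize));
        for (long written = 0; written < total; written += buf.length) {
          out.write(buf);
        }
        out.close();
        double secs = (System.currentTimeMillis() - start) / 1000.0;
        System.out.printf("%d B packets: %.1f MB/s%n", packetSize, 1024 / secs);
      }
    }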

On Tue, Oct 12, 2010 at 3:44 AM, Hairong Kuang <kuang.hairong@gmail.com> wrote:

> This might be caused by the default write packet size. In HDFS, user data
> are pipelined to datanodes in packets. The default packet size is 64K. If the
> chunk size is bigger than 64K, the packet size automatically adjusts to
> include at least one whole chunk (see the sketch at the end of this message).
>
> Please set the packet size to be 8MB by configuring
> dfs.client-write-packet-size (in trunk) and rerun your experiments.
>
> Hairong
>
>
> On 10/8/10 9:42 PM, "elton sky" <eltonsky9404@gmail.com> wrote:
>
> Hello,
>
> I was benchmarking write/read of HDFS.
>
> I changed the chunk size, i.e. bytesPerChecksum (bpc), and created a 1GB file
> with a 128MB block size. The bpc values I used: 512B, 32KB, 64KB, 256KB,
> 512KB, 2MB, and 8MB.
>
> The results surprised me. The throughput for 512B, 32KB, and 64KB is quite
> similar, but beyond that the throughput decreases as the bpc grows;
> comparing 512B to 8MB, there is a 40% to 50% difference in throughput.
>
> Any idea what causes this?
>
>
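
To illustrate Hairong's point, here is my paraphrase of how the client
sizes packets. This is only my reading of DFSClient, not the exact code;
the class and constant names are mine, and packet-header overhead is
ignored:

    // Paraphrase of the client-side packet sizing (my reading of
    // DFSClient; names are mine, header overhead ignored).
    public class PacketSizing {
      static final int CHECKSUM_SIZE = 4; // one CRC32 per chunk

      static int actualPacketSize(int bytesPerChecksum, int writePacketSize) {
        int chunkSize = bytesPerChecksum + CHECKSUM_SIZE;   // data + its checksum
        int chunksPerPacket = Math.max(writePacketSize / chunkSize, 1);
        return chunksPerPacket * chunkSize;                 // always >= one whole chunk
      }

      public static void main(String[] args) {
        System.out.println(actualPacketSize(512, 64 * 1024));             // ~64KB
        System.out.println(actualPacketSize(8 * 1024 * 1024, 64 * 1024)); // ~8MB: one chunk
      }
    }

If that reading is right, once bpc exceeds the configured packet size every
packet carries one whole chunk, so an 8MB bpc effectively forces 8MB packets
no matter what packet size is configured.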
