hadoop-hdfs-user mailing list archives

From Yanbo Liang <yanboha...@gmail.com>
Subject Re: DFSOutputStream.sync() method latency time
Date Thu, 28 Mar 2013 15:40:05 GMT
First, when a client wants to write data to HDFS, it must create an output stream.
The client then writes data to this output stream, and the stream
transfers the data to all DataNodes through the constructed pipeline in
packets of 64 KB each.
These two operations run concurrently, so the write latency is not simply the sum of the two.
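
As a rough illustration of the two concurrent operations, here is a minimal Python sketch (not HDFS code): a writer splits data into 64 KB packets and queues them while a background thread drains the queue. The names stream_packets and write are hypothetical, and appending to a list stands in for sending a packet to the DataNodes.

```python
import queue
import threading

PACKET_SIZE = 64 * 1024  # packet size mentioned in this thread


def stream_packets(packet_queue, sent):
    """Background streamer: drains queued packets.

    Appending to `sent` is a stand-in for sending a packet to the DataNodes.
    """
    while True:
        packet = packet_queue.get()
        if packet is None:  # sentinel: the writer has closed the stream
            break
        sent.append(len(packet))


def write(data):
    """Split data into packets and queue them while the streamer is running."""
    packet_queue = queue.Queue()
    sent = []
    streamer = threading.Thread(target=stream_packets, args=(packet_queue, sent))
    streamer.start()
    for offset in range(0, len(data), PACKET_SIZE):
        # put() returns without waiting for the packet to be "sent"
        packet_queue.put(data[offset:offset + PACKET_SIZE])
    packet_queue.put(None)
    streamer.join()
    return sent


print(write(b"x" * (150 * 1024)))  # [65536, 65536, 22528]
```

Because the writer only queues packets, it does not wait for each transfer to finish, which is why the two costs overlap instead of adding up.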

Second, the sync method flushes only the last packet (at most 64 KB of data) to the

Because all of these operations are processed concurrently, the latency
is smaller than the sum of the individual operation times.
In a sense, the work is done in parallel rather than serially.
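
To make the difference concrete, here is a toy model in Python using the 2 ms receive times from the question quoted below. It is only a sketch under stated assumptions, not how HDFS measures latency: sync_latency_ms and forward_delay_ms are hypothetical names, each DataNode is assumed to start receiving a fixed delay after the previous one, and acknowledgements are ignored.

```python
def sync_latency_ms(receive_ms, forward_delay_ms):
    """Time until the last DataNode has received the packet.

    Assumption: DataNode i starts receiving i * forward_delay_ms after the
    first one and then needs receive_ms[i] to finish.
    """
    return max(
        (i * forward_delay_ms + r for i, r in enumerate(receive_ms)),
        default=0,
    )


receive_ms = [2, 2, 2]
print(sum(receive_ms))                   # 6   (strictly serial)
print(sync_latency_ms(receive_ms, 0))    # 2   (fully overlapped)
print(sync_latency_ms(receive_ms, 0.5))  # 3.0 (partially overlapped)
```

Under these assumptions, the observed 2 ms corresponds to receive times that overlap almost completely, while 6 ms would require each DataNode to wait until the previous one had finished.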

2013/3/28 lei liu <liulei412@gmail.com>

> When a client writes data with three replicas, the sync method latency
> should be:
> sync method latency = first DataNode receive time + second DataNode
> receive time + third DataNode receive time.
> If each of the three DataNodes takes 2 milliseconds to receive the data,
> the sync method latency should be 6 milliseconds, but according to our
> monitoring, the sync method latency is 2 milliseconds.
> How should the sync method latency be calculated?
> Thanks,
> LiuLei
