hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: 0.92 and Read/writes not scaling
Date Tue, 03 Apr 2012 17:42:07 GMT
On Tue, Apr 3, 2012 at 9:56 AM, Jonathan Hsieh <jon@cloudera.com> wrote:
> The hypothesis was that, since I was seeing TCP ack delays in ganglia, the
> cause may be the TCP_NODELAY setting on the write side. The hdfs client
> sets this on the read side in DFSInputStream (linked here), but not on the
> DFSOutputStream write side:
>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java#L836
>
>
> // TCP_NODELAY is crucial here because of bad interactions between
> // Nagle's Algorithm and Delayed ACKs. With connection keepalive
> // between the client and DN, the conversation looks like:
> // 1. Client -> DN: Read block X
> // 2. DN -> Client: data for block X
> // 3. Client -> DN: Status OK (successful read)
> // 4. Client -> DN: Read block Y
> // The fact that step #3 and #4 are both in the client->DN direction
> // triggers Nagling. If the DN is using delayed ACKs, this results
> // in a delay of 40ms or more.
> //
>
> The fact that I am getting ackDelays on a write test may indicate that we
> need to set TCP_NODELAY on the HBase HLog write side
> (HDFS's DFSClient.DFSOutputStream in hadoop 0.20.x and DFSOutputStream in
> 0.23). I did a quick hack and tested adding socket.setTcpNoDelay(true) on
> the write side of hadoop 0.20.x and reran the PE tests; unfortunately,
> we still seem to have the SocketTimeoutException problems. Needs more
> digging.
>

Thanks Jon for the above.
St.Ack
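
For reference, the socket option discussed in the quoted text can be set through the standard java.net.Socket API. The sketch below is only an illustration on a loopback connection, not the actual HDFS change: the class name TcpNoDelaySketch and the helper openNoDelay are hypothetical, and nothing here touches DFSOutputStream.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration; this is not code from HDFS or HBase.
public class TcpNoDelaySketch {

    /** Connects to host:port and disables Nagle's algorithm on the new socket. */
    static Socket openNoDelay(InetAddress host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        // TCP_NODELAY: send small writes immediately instead of coalescing them.
        socket.setTcpNoDelay(true);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port, so the sketch has no fixed port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = openNoDelay(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            System.out.println("client TCP_NODELAY = " + client.getTcpNoDelay());
        }
    }
}
```

As the quoted message notes, setting this option on the write side did not make the timeouts go away in that test, so treat it as one variable to rule out rather than a fix.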
