hbase-issues mailing list archives

From "Nicolas Liochon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11492) The servers do not honor the tcpNoDelay option
Date Fri, 11 Jul 2014 10:03:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058600#comment-14058600 ]

Nicolas Liochon commented on HBASE-11492:

(on the bug)
I've tried different jdk7 versions; they all show the same behavior. If I understand correctly,
Esteban tested with jdk6?

Searching on Google, I get:
"channel.socket().setTcpNoDelay" => 4400 results
"channel.setOption(StandardSocketOptions.TCP_NODELAY" => 60 results

If that's representative, it could be classified as a documentation bug at the very least.
But the jdk bug database contains nothing on this. On the web, several documents mention
setOption with NIO without warning that channel.socket().setTcpNoDelay won't work. I haven't
found anyone complaining about or flagging this (well, someone has to be the first :-) )

But yeah, it's a pure NIO thing. With OIO, using the socket works (that's why the hbase client
sets Nagle correctly for its side of the connection).

bq. Should we set both of these just in case if this is a JDK issue indeed?
In any case, using setOption is fine. Requiring socket.setTcpNoDelay would be a regression.
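
As a minimal sketch of the setOption style discussed here (this is not the attached patch; the names NoDelaySketch, enableNoDelay and demo are made up for illustration), the option can be set on the channel itself and read back with getOption to check that it really took effect:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

// Hypothetical helper class, for illustration only.
final class NoDelaySketch {
    private NoDelaySketch() {}

    /**
     * Disables Nagle through the NIO channel API and returns the value
     * read back from the channel, so the caller can verify the setting.
     */
    static boolean enableNoDelay(SocketChannel channel) {
        try {
            channel.setOption(StandardSocketOptions.TCP_NODELAY, true);
            return channel.getOption(StandardSocketOptions.TCP_NODELAY);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Opens an unconnected channel, applies the option, and reports the result. */
    static boolean demo() {
        try (SocketChannel channel = SocketChannel.open()) {
            return enableNoDelay(channel);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY read back: " + demo());
    }
}
```

On the server side this would be applied to the channel returned by accept(); the sketch uses an unconnected channel only so that it runs without a network.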

bq. Wouldn't other TCP flags be affected in the same way too?
It seems not for keep-alive. I haven't tested in detail, however, so I may be wrong.

bq. since HDFS + HBase always use this style, right?
I think it's because of the OIO thing: it works fine until you use NIO.
Also, Nagle is tricky: the first packet is sent immediately, and the wait only applies to the
next ones. Nevertheless the issue is quite severe.

bq.  so weird
That's for sure :-)

[~andrew.purtell@gmail.com] What about 0.98? An intermediate patch would be to default
the value back to false, if we fear that activating nodelay 'for real' would be too risky.
[~lhofhansl] fyi.

> The servers do not honor the tcpNoDelay option
> ----------------------------------------------
>                 Key: HBASE-11492
>                 URL: https://issues.apache.org/jira/browse/HBASE-11492
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.92.2, 0.98.0, 0.96.0, 0.99.0, 0.94.20
>            Reporter: Nicolas Liochon
>            Assignee: Nicolas Liochon
>            Priority: Critical
>             Fix For: 0.99.0, 0.98.5
>         Attachments: 11492.v1.patch
> There is an option to set tcpNoDelay, defaulted to true, but the socket channel is actually
> not changed. As a consequence, the server works with Nagle enabled. This leads to very degraded
> behavior when a single connection is shared between threads. We run into the conflict between
> Nagle and TCP delayed ack.
> Here is an example of performance with the PE tool plus HBASE-11491:
> {noformat}
> oneCon  #client  sleep  exeTime (seconds)              avg latency, sleep excluded (microseconds)
> true    1        0      31
> false   1        0      31
> true    2        0      50
> false   2        0      31
> true    2        5      488 (including 200s sleeping)
> false   2        5      246 (including 200s sleeping)
> {noformat}
> The latency is multiplied by more than 5 (2880 vs 460) when the connection is shared. This is
> the delayed ack kicking in. This can be fixed by really using TCP no delay.
> Any application sharing the tcp connection between threads has the issue.
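
The figures in the description can be cross-checked with a little arithmetic. The sketch below assumes that the "200s sleeping" noted in the table can simply be subtracted from exeTime; the class and method names (LatencyRatio, netSeconds, slowdown) are made up for illustration. Under that assumption the shared connection spends 288 s working against 46 s, a ratio of about 6.3, which matches the ratio of the two quoted latencies (2880 vs 460).

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class LatencyRatio {
    private LatencyRatio() {}

    /** Time spent working: wall-clock time minus the time spent sleeping. */
    static long netSeconds(long exeTimeSeconds, long sleepSeconds) {
        return exeTimeSeconds - sleepSeconds;
    }

    /** How many times slower the shared connection is than separate connections. */
    static double slowdown(long shared, long separate) {
        return (double) shared / separate;
    }

    public static void main(String[] args) {
        // Rows with 2 clients and sleep = 5, taken from the table in the description.
        long shared = netSeconds(488, 200);    // oneCon = true
        long separate = netSeconds(246, 200);  // oneCon = false
        System.out.printf(Locale.ROOT, "net time: %d s vs %d s, ratio %.2f%n",
                shared, separate, slowdown(shared, separate));
        // Latencies quoted in the description, in microseconds.
        System.out.printf(Locale.ROOT, "latency ratio: %.2f%n", slowdown(2880, 460));
    }
}
```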

This message was sent by Atlassian JIRA