cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6235) Improve native protocol server latency
Date Mon, 04 Nov 2013 14:42:19 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812872#comment-13812872 ]

Sylvain Lebresne commented on CASSANDRA-6235:
---------------------------------------------

We are. Though the last time I tried hacking it to use Netty 4 I saw no particular difference
in performance, so I'm not too convinced that "update to netty 4 and all your problems will
go away". Besides, even if we update to Netty 4 (which we will), it'll be for C* 2.1 and likely
not before, so it would be kind of disappointing if that was indeed the problem.

> Improve native protocol server latency
> --------------------------------------
>
>                 Key: CASSANDRA-6235
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6235
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>         Attachments: NPTester.java
>
>
> The tl;dr is that the native protocol server seems to add some non-negligible latency
to operations compared to the thrift server, and as far as I can tell the added latency lies
within Netty's internals. I'm not sure what to tweak to try to reduce it.
> The test I ran is simple: it's {{stress -t 1 -L3}}, the Cassandra stress test for insertions
with just 1 thread and using CQL-over-thrift (to make things more comparable). What I'm interested
in is the average latency. Also, because I don't care about testing the storage engine or
even CQL processing, I've disabled the processing of statements: all queries just return an
empty result set right away (in particular, there's no parsing of the query); a sketch of the
idea follows below. The resulting branch is at https://github.com/pcmanus/cassandra/commits/latency-testing
(note that there's a trivial patch to have stress show the latency in microseconds).
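> (A minimal sketch of that short-circuit, with illustrative names rather than Cassandra's
actual classes; the real change lives in the linked branch:)
{code:java}
// Sketch of the "disabled processing" idea: every query is answered with a
// pre-built empty result set, bypassing parsing and execution so that only
// the transport path is measured. The Result/QueryHandler names here are
// illustrative, not Cassandra's actual API.
interface Result {}

final class EmptyResult implements Result {
    static final EmptyResult INSTANCE = new EmptyResult(); // shared, immutable
    private EmptyResult() {}
}

final class ShortCircuitQueryHandler {
    // No parsing, no validation, no execution: return the canned result.
    Result process(String queryString) {
        return EmptyResult.INSTANCE;
    }
}
{code}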
> With that branch (single node), I get ~62μs of average latency with thrift. That number
is actually fairly stable across runs (not doing any real processing helps keep the
performance consistent here).
> For the native protocol, I wanted to eliminate the possibility that the DataStax Java
driver was the bottleneck, so I wrote a very simple class (NPTester.java, attached) that emulates
the stress test above but with the native protocol. It's not excessively pretty but it's simple
(no dependencies, compiles with {{javac NPTester.java}}) and it tries to minimize the client-side
overhead. It's just a basic loop that writes query frames (serializing them largely manually)
and reads the result back, and it measures the latency as close to the socket as possible.
Unless I've done something really wrong, it should have less client-side overhead than what
stress has.
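> (To give an idea of the approach, here is a minimal sketch of such a tester. It is not the
attached NPTester.java, just an illustration: it assumes native protocol v1 framing (8-byte
header, QUERY body = long string + consistency short), auth disabled, and a placeholder query
string.)
{code:java}
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hand-rolled native protocol client loop: STARTUP once, then timed QUERYs.
public class NPTesterSketch {

    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("127.0.0.1", 9042)) {
            socket.setTcpNoDelay(true); // avoid Nagle delays skewing the latency
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            DataInputStream in = new DataInputStream(socket.getInputStream());

            // STARTUP (opcode 0x01): body is a string map with CQL_VERSION.
            // Assumes auth is disabled, so the server answers READY (0x02).
            writeFrame(out, (byte) 0x01, stringMap("CQL_VERSION", "3.0.0"));
            readFrame(in);

            byte[] query = queryBody("SELECT * FROM test.t WHERE k = 0"); // placeholder
            final int iterations = 100_000;
            long totalNanos = 0;
            for (int i = 0; i < iterations; i++) {
                long start = System.nanoTime(); // as close to the socket as possible
                writeFrame(out, (byte) 0x07, query); // QUERY (opcode 0x07)
                readFrame(in);                       // RESULT (opcode 0x08), body discarded
                totalNanos += System.nanoTime() - start;
            }
            System.out.printf("avg latency: %.1f us%n", totalNanos / 1000.0 / iterations);
        }
    }

    // v1 frame header: version, flags, stream id, opcode, then int body length.
    static void writeFrame(DataOutputStream out, byte opcode, byte[] body) throws Exception {
        out.writeByte(0x01); // request, protocol version 1
        out.writeByte(0x00); // no flags (no compression, no tracing)
        out.writeByte(0x00); // stream id 0: one request in flight at a time
        out.writeByte(opcode);
        out.writeInt(body.length);
        out.write(body);
        out.flush();
    }

    // Read one response frame, skipping the header bytes and discarding the body.
    static void readFrame(DataInputStream in) throws Exception {
        in.readFully(new byte[4]);        // version, flags, stream, opcode
        in.readFully(new byte[in.readInt()]);
    }

    // QUERY body (v1): [long string query][consistency short]; ONE = 0x0001.
    static byte[] queryBody(String query) {
        byte[] q = query.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + q.length + 2);
        buf.putInt(q.length).put(q).putShort((short) 0x0001);
        return buf.array();
    }

    // [string map] with one entry: short pair count, then length-prefixed strings.
    static byte[] stringMap(String key, String value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(2 + 2 + k.length + 2 + v.length);
        buf.putShort((short) 1).putShort((short) k.length).put(k)
           .putShort((short) v.length).put(v);
        return buf.array();
    }
}
{code}
> Note that the query body is serialized once outside the loop, so the timed window brackets
only the frame write and the response read, which is the point of the exercise.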
> With that tester, the average latency I get is ~140μs, more than twice that of thrift.
> To try to understand where that additional latency was spent, I "instrumented" the Frame
coder/decoder to record latencies (last commit of the latency-testing branch above): it records
how long it takes to decode, execute and re-encode the query. The latency for that is ~35μs
(like the other numbers above, this is pretty consistent across runs). Given that my ping on
localhost is <30μs, this suggests that Netty spends ~70μs more than the thrift server
somewhere while reading and/or writing data on the wire. I've tried profiling it with YourKit
but didn't see anything obvious, so I'm not sure what the problem is, but it sure would be
nice to get on par with thrift (or at least much closer) on such a simple test.
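> (The instrumentation amounts to little more than the following sketch: a per-phase timer
whose {{begin()}}/{{end()}} calls wrap decode, execute and encode. Names and structure are
illustrative, not the actual commit.)
{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the per-phase recording: one instance each for "decode",
// "execute" and "encode"; begin()/end() bracket the phase being measured.
public final class PhaseTimer {
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong samples = new AtomicLong();
    private final String phase;

    public PhaseTimer(String phase) {
        this.phase = phase;
    }

    public long begin() {
        return System.nanoTime();
    }

    public void end(long startNanos) {
        totalNanos.addAndGet(System.nanoTime() - startNanos);
        samples.incrementAndGet();
    }

    public String report() {
        long n = samples.get();
        double avgMicros = n == 0 ? 0 : totalNanos.get() / 1000.0 / n;
        return String.format("%s: %.1f us avg over %d samples", phase, avgMicros, n);
    }
}
{code}
> Usage is just {{long t = decodeTimer.begin(); Frame f = decode(buf); decodeTimer.end(t);}},
and similarly around execute and encode.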
> I'll note that if I run the same tests without disabling actual query processing, the
results have a bit more variability, but for thrift I get ~220-230μs latency on average while
the NPTester gets ~290-300μs. In other words, there still seems to be that ~70μs overhead
for the native protocol, which in that case is still a >30% slowdown. I'll also note that
comparisons with more threads (using the Java driver this time) also show the native
protocol being slightly slower than thrift (~5-10% slower), and while there might be
inefficiencies in the Java driver, I'm growing more and more convinced that at least part of
it is due to the latency "issue" described above.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
