cassandra-commits mailing list archives

From "Daniel Norberg (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-5422) Binary protocol sanity check
Date Tue, 07 May 2013 19:57:15 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651238#comment-13651238 ]

Daniel Norberg edited comment on CASSANDRA-5422 at 5/7/13 7:55 PM:
-------------------------------------------------------------------

The main issues I identified:

* Contention in the driver, i.e. per-connection locks taken for every request.
* Expensive serialization, i.e. multiple layers of ChannelBuffers used in the ExecuteMessage codec.
* No write batching, i.e. every message results in an expensive syscall.
* Contention in the stress application, which bottlenecks on a shared work queue and spawns one thread per asynchronous worker.
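To illustrate the write-batching point: instead of issuing one write syscall per message, frames can be accumulated in a buffer and pushed to the socket in a single write. This is only a minimal sketch of the idea with hypothetical names, not the actual driver code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch of write batching: serialized frames are buffered
// and handed to the underlying stream with one write() call per flush,
// instead of one syscall per frame.
class BatchingWriter {
    private final OutputStream out;
    private final ByteArrayOutputStream batch = new ByteArrayOutputStream();
    private final int flushThreshold;

    BatchingWriter(OutputStream out, int flushThreshold) {
        this.out = out;
        this.flushThreshold = flushThreshold;
    }

    void enqueue(byte[] frame) throws IOException {
        batch.write(frame);
        if (batch.size() >= flushThreshold) {
            flush();
        }
    }

    void flush() throws IOException {
        if (batch.size() > 0) {
            out.write(batch.toByteArray()); // one write for many frames
            batch.reset();
        }
    }
}
```

In practice the flush would be tied to the event loop (flush once per iteration) rather than a byte threshold, which is what amortizes the syscall cost across all requests queued in that iteration.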

After eliminating contention in the driver and the stress application, optimizing serialization, and adding write batching, I get a throughput of 200k+ requests per second on my laptop (four-core 2 GHz i7 MacBook Pro) when making asynchronous requests at a concurrency level of 500. This is with request execution and mutation disabled via the above patch, running both Cassandra and the stress tool on Java 7. At this throughput the benchmark uses ~60 MB/s of bandwidth, so server-grade hardware should be able to saturate a 1 Gbit Ethernet interface, especially with larger payloads.
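As a back-of-the-envelope check on that bandwidth figure (round numbers assumed, not exact benchmark output):

```java
// Sanity check of the throughput/bandwidth relationship claimed above.
// The input numbers are rounded approximations of the benchmark results.
class BandwidthCheck {
    // Implied wire cost per request (request + response combined), in bytes.
    static double bytesPerRequest(double megabytesPerSec, double requestsPerSec) {
        return megabytesPerSec * 1_000_000 / requestsPerSec;
    }

    // Payload capacity of a 1 Gbit/s link, in MB/s (ignoring framing overhead).
    static double gigabitMBps() {
        return 1_000_000_000 / 8.0 / 1_000_000; // = 125 MB/s
    }
}
```

At ~230k requests/s and ~60 MB/s, that works out to roughly 260 bytes on the wire per request, and a 1 GbE link (125 MB/s) leaves about 2x headroom at the same payload size.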

https://github.com/danielnorberg/java-driver/tree/optimization
https://github.com/danielnorberg/cassandra/tree/transport-benchmark

{noformat}
5/7/13 2:59:54 PM ==============================================================
com.datastax.driver.stress.Reporter:
  latencies:
             count = 352558280
         mean rate = 230848.13 calls/s
     1-minute rate = 223475.90 calls/s
     5-minute rate = 224159.41 calls/s
    15-minute rate = 190931.94 calls/s
               min = 0.27ms
               max = 124.37ms
              mean = 2.16ms
            stddev = 1.63ms
            median = 1.69ms
              75% <= 2.43ms
              95% <= 5.57ms
              98% <= 6.64ms
              99% <= 8.76ms
            99.9% <= 26.57ms

  requests:
             count = 352559217
         mean rate = 230848.12 requests/s
     1-minute rate = 223474.50 requests/s
     5-minute rate = 224159.75 requests/s
    15-minute rate = 190950.27 requests/s
{noformat}

Suggestions for further work:

* Use a uniform histogram instead of the (default) biased histogram, since the biased histogram takes an expensive read-write lock for every update, i.e. every request. Alternatively, find some way to eliminate the read-write locking in the biased histogram.
* Make StorageProxy non-blocking and use the jsr166e ForkJoinPool instead of a normal ThreadPoolExecutor for a nice throughput boost when working with a large volume of small messages.
* Change the protocol to allow more than 128 outstanding requests per connection.
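On that last point: the limit comes from the v1 native protocol's one-byte signed stream id, with negative values reserved for server-initiated events, which leaves ids 0..127 for in-flight client requests. A hypothetical per-connection allocator (illustrative names, not the driver's actual code) shows the constraint:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of per-connection stream-id allocation under the v1 protocol:
// the stream field is one signed byte and negative ids are reserved for
// server events, so at most 128 requests can be in flight per connection.
class StreamIdPool {
    private final Deque<Integer> free = new ArrayDeque<>();

    StreamIdPool() {
        for (int id = 0; id < 128; id++) free.add(id);
    }

    // Returns -1 when all 128 ids are in flight; the caller must then
    // queue the request or fall back to another connection.
    int acquire() {
        Integer id = free.poll();
        return id == null ? -1 : id;
    }

    void release(int id) {
        free.add(id);
    }
}
```

Widening the stream field (or using multiple connections per host) is what would lift the per-connection concurrency ceiling.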

When running normally with request execution enabled, I get ~24k requests/s. Quick profiling indicates that there are some contention points that could be removed, e.g. the ReentrantReadWriteLock (switchLock) in Table. We should be able to optimize the whole stack to the point where a Cassandra node can achieve a sustained rate of 100k+ writes per second.


                
> Binary protocol sanity check
> ----------------------------
>
>                 Key: CASSANDRA-5422
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5422
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Daniel Norberg
>         Attachments: 5422-test.txt
>
>
> With MutationStatement.execute turned into a no-op, I only get about 33k insert_prepared
ops/s on my laptop.  That is: this is an upper bound for our performance if Cassandra were
infinitely fast, limited by netty handling the protocol + connections.
> This is up from about 13k/s with MS.execute running normally.
> ~40% overhead from netty seems awfully high to me, especially for insert_prepared where
the return value is tiny.  (I also used 4-byte column values to minimize that part as well.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
