cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-9558) Cassandra-stress regression in 2.2
Date Fri, 26 Jun 2015 08:23:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602554#comment-14602554 ]

Benedict edited comment on CASSANDRA-9558 at 6/26/15 8:22 AM:
--------------------------------------------------------------

Can we answer my question before forging ahead and changing any default pooling settings?
As I said, this is not necessarily a *bug*. It is quite likely that this configuration
improves throughput for many normal cluster configurations, and has negative implications
only for very small clusters. We want the fewest connections we can get away with; perhaps
the client should automatically scale the connections based on throughput or cluster size.
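
To make the scaling idea concrete, here is a rough sketch of what "scale connections
by cluster size" could look like. This is not something the Java driver does, and the
function name, the connection budget, and the bounds are all hypothetical placeholders
rather than measured values:

```python
def connections_per_host(cluster_size, target_total=8, min_per_host=1, max_per_host=8):
    """Hypothetical heuristic: spread a fixed connection budget across hosts.

    Small clusters get several connections per host so one client can still
    saturate them; larger clusters fall back to the minimum per host.
    All defaults are illustrative assumptions, not tuned numbers.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be >= 1")
    # Ceiling division, so the total never drops below the budget.
    per_host = -(-target_total // cluster_size)
    # Clamp to the allowed per-host range.
    return max(min_per_host, min(max_per_host, per_host))
```

With these placeholder defaults a single-node cluster would get 8 connections, a
three-node cluster 3 per host, and anything with 8 or more nodes 1 per host. Scaling on
observed throughput instead would need real measurements to pick sensible thresholds.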

We haven't undertaken sufficient investigation to say with certainty, but it seems that what
we are doing here is increasing the CPU _overhead_ per operation in order to _saturate_ the
processing capacity of each box. However, when there are more machines, or more simulated clients,
this increased overhead is highly likely to reduce throughput.

What we should probably do on our end is implement CASSANDRA-8466, since this is how a majority
of users really use their clusters: many clients, not one client with many connections in
the Java driver. In the meantime the proposed patch for ourselves is fine, but I want to avoid
us making the real world worse by changing the driver back.


was (Author: benedict):
Can we answer my question before forging ahead and changing any default pooling settings?
Like I say, it's not at all necessarily a *bug*. It is quite likely that this configuration
improves throughput for many normal cluster configurations, and has negative implications
only for very small clusters. We want the fewest connections we can get away with; perhaps,
the client should automatically scale the connections based on throughput or cluster size.

We haven't undertaken sufficient investigation to say with certainty, but it seems that what
we are doing here is increasing the CPU _overhead_ per operation in order to _saturate_ the
processing capacity of each box. However when there are more machines, or more simulated clients,
this increased overhead is highly likely to reduce throughput due to the increased overhead.

What we should probably do on our end is implement CASSANDRA-8466, since this is how a majority
of users really use their clusters: many clients, not one client with many connections in
the Java driver.

> Cassandra-stress regression in 2.2
> ----------------------------------
>
>                 Key: CASSANDRA-9558
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9558
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alan Boudreault
>            Assignee: Andy Tolbert
>             Fix For: 2.2.0 rc2
>
>         Attachments: 2.1.log, 2.2.log, CASSANDRA-9558-2.patch, CASSANDRA-9558-ProtocolV2.patch,
> atolber-CASSANDRA-9558-stress.tgz, atolber-trunk-driver-coalescing-disabled.txt, stress-2.1-java-driver-2.0.9.2.log,
> stress-2.1-java-driver-2.2+PATCH.log, stress-2.1-java-driver-2.2.log, stress-2.2-java-driver-2.2+PATCH.log,
> stress-2.2-java-driver-2.2.log
>
>
> We are seeing some regression in performance when using cassandra-stress 2.2. You can
> see the difference at this URL:
> http://riptano.github.io/cassandra_performance/graph_v5/graph.html?stats=stress_regression.json&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=108.57&ymin=0&ymax=168147.1
> The cassandra version of the cluster doesn't seem to have any impact. 
> //cc [~tjake] [~benedict]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
