cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results
Date Thu, 26 Sep 2013 20:56:06 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779217#comment-13779217 ]

Jonathan Ellis commented on CASSANDRA-5932:
-------------------------------------------

The logic looks like this:

# Figure out how many replicas we need to contact to satisfy the desired consistencyLevel + Read Repair settings.
# If that ends up being all the replicas, then use ASRE to get some redundancy on the data reads.  This will allow the read to succeed even if a digest for RR times out.  Of course, if you are reading at CL.ALL and a replica times out, there's nothing we can do.
# Otherwise, use SRE and make an "extra" request later if it looks like one of the minimal set isn't going to respond in time.
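
The three steps above can be sketched roughly as follows. This is only an illustration of the decision described in this comment, not Cassandra's actual code: `ReadPlan`, `plan_read`, and the strategy labels are hypothetical names, and modeling read repair as a simple count of additional replicas (`read_repair_extra`) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPlan:
    strategy: str         # "always_speculate" (ASRE) or "speculate_on_delay" (SRE)
    initial_targets: int  # replicas contacted up front
    spare_replicas: int   # replicas left over for a later "extra" request


def plan_read(live_replicas: int, cl_block_for: int, read_repair_extra: int) -> ReadPlan:
    """Hypothetical sketch of the executor choice described in the comment."""
    if cl_block_for > live_replicas:
        raise ValueError("not enough live replicas for the consistency level")

    # Step 1: replicas needed for the consistency level plus read repair,
    # capped at the number of replicas that exist.
    needed = min(live_replicas, cl_block_for + read_repair_extra)

    if needed == live_replicas:
        # Step 2: every replica is already being contacted, so no spare is
        # left for a delayed retry; add redundancy to the data reads up front.
        return ReadPlan("always_speculate", needed, 0)

    # Step 3: contact the minimal set now and keep the rest in reserve for an
    # "extra" request if one of the minimal set looks too slow.
    return ReadPlan("speculate_on_delay", needed, live_replicas - needed)
```

For example, with 3 live replicas, a consistency level that blocks for 2, and no read repair, this sketch contacts 2 replicas and keeps 1 in reserve; if read repair pulls in the third replica, it falls into the always-speculate case.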

Note that performing extra data requests does not affect handler.blockfor; it just makes it possible for the request to proceed once it gets enough responses back, no matter which replicas they come from.
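
A toy model of that last point, under stated assumptions: `ResponseTracker` and `simulate` are hypothetical names for illustration and do not correspond to Cassandra's handler classes. The sketch shows the threshold staying fixed while an extra request only widens the set of replicas that can count toward it.

```python
class ResponseTracker:
    """Hypothetical stand-in for a read handler that blocks for N responses."""

    def __init__(self, block_for: int):
        self.block_for = block_for  # fixed when the read starts
        self.responded = set()

    def on_response(self, replica: str) -> None:
        self.responded.add(replica)

    def is_satisfied(self) -> bool:
        # Any replicas count, whether from the initial set or an extra request.
        return len(self.responded) >= self.block_for


def simulate(block_for, initial_targets, extra_targets, responders):
    """Return (satisfied, block_for) after all contacted replicas that respond."""
    tracker = ResponseTracker(block_for)
    for replica in list(initial_targets) + list(extra_targets):
        if replica in responders:
            tracker.on_response(replica)
    return tracker.is_satisfied(), tracker.block_for


# Replica "b" never answers. Without an extra request the read cannot finish;
# with an extra request to "c" it can, and block_for is still 2 in both cases.
without_extra = simulate(2, ["a", "b"], [], {"a", "c"})
with_extra = simulate(2, ["a", "b"], ["c"], {"a", "c"})
```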
                
> Speculative read performance data show unexpected results
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-5932
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5932
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Ryan McGuire
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.0.2
>
>         Attachments: 5932.txt, compaction-makes-slow.png, compaction-makes-slow-stats.png, eager-read-looks-promising.png, eager-read-looks-promising-stats.png, eager-read-not-consistent.png, eager-read-not-consistent-stats.png, node-down-increase-performance.png
>
>
> I've done a series of stress tests with eager retries enabled that show undesirable behavior. I'm grouping these behaviors into one ticket as they are most likely related.
> 1) Killing off a node in a 4 node cluster actually increases performance.
> 2) Compactions make nodes slow, even after the compaction is done.
> 3) Eager Reads tend to lessen the *immediate* performance impact of a node going down, but not consistently.
> My Environment:
> 1 stress machine: node0
> 4 C* nodes: node4, node5, node6, node7
> My script:
> node0 writes some data: stress -d node4 -F 30000000 -n 30000000 -i 5 -l 2 -K 20
> node0 reads some data: stress -d node4 -n 30000000 -o read -i 5 -K 20
> h3. Examples:
> h5. A node going down increases performance:
> !node-down-increase-performance.png!
> [Data for this test here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> At 450s, I kill -9 one of the nodes. There is a brief decrease in performance as the snitch adapts, but then it recovers... to even higher performance than before.
> h5. Compactions make nodes permanently slow:
> !compaction-makes-slow.png!
> !compaction-makes-slow-stats.png!
> The green and orange lines represent trials with eager retry enabled; they never recover their op-rate from before the compaction, as the red and blue lines do.
> [Data for this test here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.compaction.2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> h5. Speculative Read tends to lessen the *immediate* impact:
> !eager-read-looks-promising.png!
> !eager-read-looks-promising-stats.png!
> This graph looked the most promising to me: the two trials with eager retry (the green and orange lines) showed the smallest dip in performance at 450s.
> [Data for this test here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> h5. But not always:
> !eager-read-not-consistent.png!
> !eager-read-not-consistent-stats.png!
> This is a retrial with the same settings as above, yet the 95th-percentile eager retry (red line) did poorly this time at 450s.
> [Data for this test here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.rc1.try2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]

