cassandra-commits mailing list archives

From "Chris Burroughs (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-6465) DES scores fluctuate too much for cache pinning
Date Mon, 09 Dec 2013 18:24:07 GMT
Chris Burroughs created CASSANDRA-6465:
------------------------------------------

             Summary: DES scores fluctuate too much for cache pinning
                 Key: CASSANDRA-6465
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6465
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: 1.2.11, 2 DC cluster
            Reporter: Chris Burroughs


To quote the conf:

{noformat}
# if set greater than zero and read_repair_chance is < 1.0, this will allow
# 'pinning' of replicas to hosts in order to increase cache capacity.
# The badness threshold will control how much worse the pinned host has to be
# before the dynamic snitch will prefer other replicas over it.  This is
# expressed as a double which represents a percentage.  Thus, a value of
# 0.2 means Cassandra would continue to prefer the static snitch values
# until the pinned host was 20% worse than the fastest.
dynamic_snitch_badness_threshold: 0.1
{noformat}
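
As a rough illustration of the rule the comment describes (not Cassandra's actual code), the check can be sketched as below. The function name and arguments are hypothetical, and the sketch assumes that a lower score is better and that "20% worse" means the pinned host's score exceeds the best score by more than that fraction.

```python
def keeps_pinned_replica(pinned_score, best_score, badness_threshold=0.1):
    """Return True if the pinned replica would still be preferred.

    Illustrative only: assumes lower scores are better and that the
    threshold is a fraction of the best (lowest) score.
    """
    if badness_threshold <= 0:
        # Pinning disabled: the pinned host must itself be the best.
        return pinned_score <= best_score
    # Stay pinned until the pinned host is more than `badness_threshold`
    # worse than the fastest replica.
    return pinned_score <= best_score * (1.0 + badness_threshold)
```

With the default of 0.1, a pinned host scoring 1.05 against a best of 1.0 stays pinned, while one scoring 1.25 does not.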

An assumption of this feature is that scores will vary by less than dynamic_snitch_badness_threshold
during normal operations.  Attached is the result of polling a node for the scores of 6 different
endpoints at 1 Hz for 15 minutes.  The endpoints to sample were chosen with `nodetool getendpoints`
for a row that is known to get reads.  The node was acting as a coordinator for a few hundred
req/second, so it should have sufficient data to work with.  Other traces on a second cluster
have produced similar results.
 * The scores vary by far more than I would expect, as shown by the difficulty of seeing anything
useful in that graph.
 * The difference between the best and next-best score is usually > 10% (default dynamic_snitch_badness_threshold).
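
One way to quantify the second observation from polled samples is sketched below. This is an assumed approach, not the attached script: the function names are hypothetical, each sample is taken to be the list of scores for the endpoints at one polling instant, and lower scores are assumed to be better.

```python
def relative_gap(scores):
    """Relative difference between the best and next-best score in one sample."""
    ordered = sorted(scores)
    best, second = ordered[0], ordered[1]
    if best == 0:
        # Avoid dividing by zero; any positive next-best is infinitely worse.
        return float("inf") if second > 0 else 0.0
    return (second - best) / best


def fraction_over_threshold(samples, threshold=0.1):
    """Fraction of samples whose best/next-best gap exceeds the threshold."""
    gaps = [relative_gap(s) for s in samples if len(s) >= 2]
    if not gaps:
        return 0.0
    return sum(1 for g in gaps if g > threshold) / len(gaps)
```

A result close to 1.0 for real samples would correspond to the gap "usually" exceeding the threshold.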

Neither ClientRequest nor ColumnFamily metrics showed wild changes during the data gathering
period.

Attachments:
 * jython script cobbled together to gather the data (based on work on the mailing list from
Maki Watanabe a while back)
 * csv of DES scores for 6 endpoints, polled about once a second
 * Attempt at making a graph





--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
