cassandra-commits mailing list archives

From "Branimir Lambov (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12777) Optimize the vnode allocation for single replica per DC
Date Fri, 14 Oct 2016 09:02:20 GMT


Branimir Lambov commented on CASSANDRA-12777:

"Percent" implies 0-100, which isn't what you are using (rightly so). "Ratio" is a better term
for a 0-1 multiplier.

How do you handle a result above the maximum in the wraparound calculation?
For {{LongToken}} that's not a problem as the token gets wrapped on conversion to long, but
I don't think that happens for {{BigIntegerToken}}. This needs a test as well (including a
random one checking that {{a.size(split(a, b, x))}} is within a couple of ulps of {{x * a.size(b)}},
and a validity check for the returned token).
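
A rough sketch of what such a random test could check follows. It is not the patch's code: the class name, the static helper signatures and the ring size {{RING}} (tokens assumed to lie in [0, 2^127)) are assumptions made for illustration only. The point is the explicit wrap in {{split}} and the pairing of the ulp comparison with a validity check.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Random;

// Illustrative only: names, signatures and the ring size are assumptions.
final class RingSplitSketch
{
    // Assumed ring size; valid tokens lie in [0, RING).
    static final BigInteger RING = BigInteger.ONE.shiftLeft(127);

    // Clockwise distance from left to right in tokens; equal ends mean the full ring.
    static BigInteger width(BigInteger left, BigInteger right)
    {
        BigInteger w = right.subtract(left).mod(RING);
        return w.signum() == 0 ? RING : w;
    }

    // Fraction of the ring covered by (left, right].
    static double size(BigInteger left, BigInteger right)
    {
        return width(left, right).doubleValue() / RING.doubleValue();
    }

    // Token at the given ratio of the way from left to right, wrapped back into range.
    static BigInteger split(BigInteger left, BigInteger right, double ratio)
    {
        BigInteger offset = new BigDecimal(width(left, right))
                            .multiply(new BigDecimal(ratio))
                            .toBigInteger();
        return left.add(offset).mod(RING); // without the mod, the sum can exceed the maximum
    }

    static boolean isValid(BigInteger token)
    {
        return token.signum() >= 0 && token.compareTo(RING) < 0;
    }

    // Random check: size(left, split(left, right, x)) stays within a few ulps of x * size(left, right).
    static boolean randomCheck(long seed, int iterations)
    {
        Random rnd = new Random(seed);
        for (int i = 0; i < iterations; i++)
        {
            BigInteger left = new BigInteger(127, rnd);
            BigInteger right = new BigInteger(127, rnd);
            double ratio = 0.05 + 0.9 * rnd.nextDouble(); // stay away from 0 and 1
            BigInteger token = split(left, right, ratio);
            double expected = ratio * size(left, right);
            double actual = size(left, token);
            if (!isValid(token) || Math.abs(actual - expected) > 4 * Math.ulp(expected))
                return false;
        }
        return true;
    }

    public static void main(String[] args)
    {
        System.out.println("random check passed: " + randomCheck(42L, 10_000));
    }
}
```

The tolerance of 4 ulps is a guess at "a couple of ulps" with some slack for the two roundings in the expected value. A real test would also want explicit cases where the left token sits close to the maximum, so the wrap is always exercised.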

The {{createTokenInfo}} call in the constructor appears superfluous.

Most of the comments in the original implementation add important information and should be
preserved (e.g. the one explaining why the + 2 is there).

Did you try lower fractions than 0.99 for takeovers? I would go lower, perhaps 0.9 or even
0.75 (try the simulation out).

Nit: From a design perspective I believe it would be cleaner to leave the {{TokenAllocator}}
interface as an interface, put the factory method there, and move the abstract base class to
a {{TokenAllocatorBase}}.
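
To make the layout concrete, here is a minimal sketch of the arrangement I have in mind. Only the names {{TokenAllocator}} and {{TokenAllocatorBase}} come from this discussion; the methods, the {{create}} factory signature and the two concrete classes are hypothetical placeholders, not the real API.

```java
// Hypothetical, heavily simplified shapes; the methods here are not the real API.
interface TokenAllocator
{
    int replicas();

    String strategy();

    // Factory method on the interface (static interface methods need Java 8 or later).
    static TokenAllocator create(int replicas)
    {
        if (replicas < 1)
            throw new IllegalArgumentException("replicas must be positive");
        return replicas == 1 ? new SingleReplicaAllocator() : new MultiReplicaAllocator(replicas);
    }
}

// Shared state and helpers move here, leaving the interface free of implementation.
abstract class TokenAllocatorBase implements TokenAllocator
{
    private final int replicas;

    protected TokenAllocatorBase(int replicas) { this.replicas = replicas; }

    @Override
    public int replicas() { return replicas; }
}

// Placeholder for the single-replica case.
final class SingleReplicaAllocator extends TokenAllocatorBase
{
    SingleReplicaAllocator() { super(1); }

    @Override
    public String strategy() { return "single-replica"; }
}

// Placeholder for the multiple-replica case.
final class MultiReplicaAllocator extends TokenAllocatorBase
{
    MultiReplicaAllocator(int replicas) { super(replicas); }

    @Override
    public String strategy() { return "multi-replica"; }
}
```

Callers would then depend only on the interface, e.g. {{TokenAllocator.create(1)}}, and never name the base class or a concrete allocator.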

> Optimize the vnode allocation for single replica per DC
> -------------------------------------------------------
>                 Key: CASSANDRA-12777
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Dikang Gu
>            Assignee: Dikang Gu
>             Fix For: 3.x
> The new vnode allocation algorithm introduced in CASSANDRA-7032 is optimized for the
> situation where there are multiple replicas per DC.
> In our production environment, most clusters have only one replica, and in this case the algorithm
> does not work perfectly. It always tries to split token ranges in half, so the ownership
> of the "min" node can go as low as ~60% of the average.
> So for the single-replica case, I'm working on a new algorithm, based on Branimir's
> previous commit, that splits token ranges by "some" percentage instead of always in half. In
> this way, we can get a very small variation in ownership among different nodes.

This message was sent by Atlassian JIRA
