cassandra-user mailing list archives

From Jason Wee <peich...@gmail.com>
Subject Re: distribution of token ranges with virtual nodes
Date Thu, 10 Jan 2013 02:05:35 GMT
It should be in trunk. See:
https://github.com/apache/cassandra/blob/trunk/bin/cassandra-shuffle


On Thu, Jan 10, 2013 at 1:18 AM, Manu Zhang <owenzhang1990@gmail.com> wrote:

> Is the cassandra-shuffle command in trunk, or is it only included in the
> Debian package? I can't find it in trunk.
>
>
> On Sat, Nov 3, 2012 at 2:18 AM, Eric Evans <eevans@acunu.com> wrote:
>
>> On Fri, Nov 2, 2012 at 12:38 AM, Manu Zhang <owenzhang1990@gmail.com>
>> wrote:
>> >> It splits into a contiguous range, because truly upgrading to vnode
>> >> functionality is another step.
>> >
>> > That confuses me. As I understand it, there is no point in having 256
>> > tokens on the same node if I don't commit the shuffle.
>>
>> This isn't exactly true.  By-partition operations (think repair,
>> streaming, etc) will be more reliable in the sense that if they fail
>> and need to be restarted, there is less that is lost/needs redoing.
>> Also, if all you did was migrate from 1-token-per-node to 256
>> contiguous tokens per node, normal topology changes (bootstrapping new
>> nodes, decommissioning old ones) would gradually work to redistribute
>> the partitions.  And, from a topology perspective, splitting the one
>> partition into many contiguous partitions is a no-op; it's safe to do
>> and there is no cost to speak of from a computational or IO
>> perspective.
>>
>> On the other hand, shuffling requires moving tokens around the
>> cluster.  If you completely randomize placement, it follows that you
>> will need to relocate all of the cluster's data, so it's quite costly.
>> It's also precedent-setting, and not thoroughly tested yet.
>>
>> --
>> Eric Evans
>> Acunu | http://www.acunu.com | @acunu
>>
>
>
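
To make Eric's distinction concrete, here is a rough toy model in Python. It is not how Cassandra or cassandra-shuffle is implemented; the ring size, the helper names (split_contiguous, shuffle_owners, moved_fraction) and the four-node layout are all made up for illustration. It assumes a node with token T owns the range (previous token, T]. Splitting each node's range into contiguous tokens leaves every key with the same owner, while randomly reassigning those tokens changes the owner of most keys.

```python
import bisect
import random

RING = 2**64  # hypothetical ring size, for illustration only


def split_contiguous(initial_tokens, num_tokens):
    """Split each node's single range into num_tokens contiguous sub-ranges.

    initial_tokens maps node name -> its one token. A node with token T is
    assumed to own (previous token, T]. Returns a sorted list of (token, node).
    """
    nodes = sorted(initial_tokens.items(), key=lambda kv: kv[1])
    ring = []
    for i, (node, token) in enumerate(nodes):
        prev = nodes[i - 1][1]  # predecessor token (index -1 wraps around)
        width = (token - prev) % RING or RING  # a lone node owns the whole ring
        for k in range(1, num_tokens + 1):
            # the last sub-range (k == num_tokens) ends exactly at the old token
            ring.append(((prev + width * k // num_tokens) % RING, node))
    return sorted(ring)


def owner(ring, key):
    """Return the node owning key: the first token >= key, wrapping around."""
    tokens = [t for t, _ in ring]
    return ring[bisect.bisect_left(tokens, key) % len(ring)][1]


def shuffle_owners(ring, rng):
    """Toy stand-in for a shuffle: randomly permute which node holds each token."""
    owners = [o for _, o in ring]
    rng.shuffle(owners)
    return [(t, o) for (t, _), o in zip(ring, owners)]


def moved_fraction(before, after, keys):
    """Fraction of sample keys whose owning node differs between two rings."""
    return sum(owner(before, k) != owner(after, k) for k in keys) / len(keys)


if __name__ == "__main__":
    rng = random.Random(1)
    initial = {"a": 0, "b": RING // 4, "c": RING // 2, "d": 3 * RING // 4}
    single = sorted((t, n) for n, t in initial.items())
    split = split_contiguous(initial, 256)
    keys = [rng.randrange(RING) for _ in range(2000)]
    print("moved by split:  ", moved_fraction(single, split, keys))
    print("moved by shuffle:", moved_fraction(split, shuffle_owners(split, rng), keys))
```

In this model the split moves nothing, which matches the "no-op" description above, and the shuffle is where the data movement comes from.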
