incubator-cassandra-user mailing list archives

From David McNelis <dmcne...@gmail.com>
Subject Re: Node tokens / data move
Date Tue, 16 Jul 2013 13:09:00 GMT
Eric,

Unfortunately, if you've got a non-vnode cluster and are trying to convert,
you will likely want to, if not have to, run shuffle.  It isn't a pleasant
situation, because for the shuffle to execute safely and successfully you
need, at a minimum, roughly twice as much disk space as the size of your
data: no data is removed from its original node until a cleanup is run.

This also means that other Cassandra operations that cause disk usage to
balloon are competing for space with the shuffle process, and boom, just
like that, you are out of disk space.
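
As a rough sketch of that headroom arithmetic (the function name and the
other_ops_reserve_bytes parameter are hypothetical, not part of any Cassandra
tool, and the 2x factor is only the rule of thumb above, not a measured value):

```python
GIB = 1024 ** 3


def shuffle_headroom_bytes(data_bytes, disk_capacity_bytes, other_ops_reserve_bytes=0):
    """Return the bytes left over after budgeting for a shuffle.

    A negative result means the node does not have enough disk.
    """
    # Rule of thumb from the message: budget about 2x the current data size,
    # because relocated data stays on the source node until cleanup is run.
    needed = 2 * data_bytes + other_ops_reserve_bytes
    return disk_capacity_bytes - needed


# 400 GiB of data on a 1000 GiB disk, nothing else competing for space.
print(shuffle_headroom_bytes(400 * GIB, 1000 * GIB) // GIB)             # 200
# Same node, but another operation may temporarily need 300 GiB.
print(shuffle_headroom_bytes(400 * GIB, 1000 * GIB, 300 * GIB) // GIB)  # -100
```

The second call shows the failure mode: the shuffle alone fits, but it no
longer fits once another disk-hungry operation runs at the same time.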

David


On Tue, Jul 16, 2013 at 8:35 AM, Eric Stevens <mightye@gmail.com> wrote:

> vnodes currently do not bring any noticeable benefits to outweigh the trouble
>
>
> The main advantage of vnodes is that they give you flexibility in
> adding and removing nodes from your cluster without having to
> rebalance your cluster (issuing a lot of moves).  A shuffle is a lot of
> moves too, so the only reason to do this is if you've discovered that the
> random allocation of vnodes has managed to leave an unbalanced cluster (one
> node either has an unusually large load, or an unusually small one).
>
> Shuffling isn't something you should be doing as a matter of course, and
> with 256 vnodes per server and 3+ servers, a badly unbalanced cluster is
> more likely the result of bad data partitioning (extremely unbalanced row
> sizes) than of vnode token placement.  With unbalanced rows, nothing will
> solve balance problems except finding a different way to partition your
> data across rows.
>
>
>
> On Mon, Jul 15, 2013 at 7:10 AM, Radim Kolar <hsn@filez.com> wrote:
>
>>
>>> My understanding is that it is not possible to change the number of
>>> tokens after the node has been initialized.
>>>
>> That was my conclusion too. vnodes currently do not bring any noticeable
>> benefits to outweigh the trouble. Shuffle is very slow in a large cluster.
>> Recovery is faster with vnodes, but I have very few node failures per year.
>>
>
>
