incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: cassandra 1.2.5- virtual nodes (num_token) pros/cons?
Date Tue, 06 Aug 2013 07:40:07 GMT
> The reason I am looking at virtual nodes is the terrible experience we had with 0.8 repairs; according to the documentation (and logically), virtual nodes seem likely to make repairs smoother. Is this true?
I've not thought too much about how they would help repairs run more smoothly. Which documentation did you read?

> Also how to get the right number of virtual nodes?
Use the default of 256.
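
To illustrate why a larger token count per node tends to even out load, here is a minimal sketch, not Cassandra's actual token allocation code. It assumes tokens are picked uniformly at random from a 2**64 ring and that a node owns the range ending at each of its tokens. The function name, ring size, and seed are assumptions made for the illustration.

```python
import random

RING_SIZE = 2**64  # assumed token space size, for illustration only


def ownership_spread(num_nodes, num_tokens, seed=0):
    """Return the max/mean ratio of ring ownership under random token assignment.

    A value close to 1.0 means the most loaded node owns about its fair share.
    """
    rng = random.Random(seed)
    # Every node picks num_tokens random tokens; sort them into ring order.
    ring = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    for i, (token, node) in enumerate(ring):
        prev_token = ring[i - 1][0]  # i == 0 wraps around to the last token
        owned[node] += (token - prev_token) % RING_SIZE
    mean = RING_SIZE / num_nodes
    return max(owned) / mean


if __name__ == "__main__":
    # Compare one token per node with the 64 and 256 values discussed in this thread.
    for num_tokens in (1, 64, 256):
        print(num_tokens, round(ownership_spread(20, num_tokens), 2))
```

Under these assumptions the busiest node moves much closer to an even share as the token count grows. The sketch says nothing about the per-token overhead raised elsewhere in this thread.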


Hope that helps. 

 
-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/08/2013, at 7:39 AM, rash aroskar <rashmi.aroskar@gmail.com> wrote:

> Thanks for the helpful responses. The upgrade from 0.8 to 1.2 is not direct: we have set up a test cluster where we upgraded from 0.8 to 1.1 and then to 1.2. We will also build a separate cluster on 1.2; the 0.8 cluster will not be upgraded, but its data will be moved to the 1.2 cluster.
> The reason I am looking at virtual nodes is the terrible experience we had with 0.8 repairs; according to the documentation (and logically), virtual nodes seem likely to make repairs smoother. Is this true? Also, how do I choose the right number of virtual nodes? David suggested 64 vnodes for 20 machines. Is there a formula or a thought process to follow to get this number right?
> 
> 
> On Mon, Jul 29, 2013 at 4:15 AM, aaron morton <aaron@thelastpickle.com> wrote:
> I would *strongly* recommend against upgrading from 0.8 directly to 1.2. Skipping a major version is generally not recommended; skipping 3 would seem like carelessness.
> 
>> I second Romain, do the upgrade and make sure the health is good first.
> 
> +1, but I would also recommend deciding whether you actually need virtual nodes. The shuffle process can take a long time, and people have had mixed experiences with it.
> 
> If you wanted to move to 1.2 and get vnodes, I would consider spinning up a new cluster and bulk loading into it. You could do an initial load and then do delta loads using snapshots; there would, however, be a period of stale data in the new cluster until the last delta snapshot is loaded.
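
One way to pick out what a delta load needs to send is sketched below. It is an assumption about how the delta could be computed, not something described in this thread. It relies on SSTable files being immutable, so a file name already present in the previously loaded snapshot does not need loading again. The function names and file names are hypothetical.

```python
from pathlib import Path


def delta_sstables(previous_names, current_names):
    """Return the file names in the current snapshot that were not in the previous one."""
    already_loaded = set(previous_names)
    return sorted(name for name in current_names if name not in already_loaded)


def snapshot_file_names(snapshot_dir):
    """List the plain file names in a snapshot directory (the path comes from the caller)."""
    return [p.name for p in Path(snapshot_dir).iterdir() if p.is_file()]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the set difference.
    first = ["cf-1-Data.db", "cf-2-Data.db"]
    second = ["cf-2-Data.db", "cf-3-Data.db", "cf-4-Data.db"]
    print(delta_sstables(first, second))
```

Compaction rewrites files between snapshots, so the delta may contain data that was already loaded. The sketch treats that as acceptable and makes no attempt to avoid it.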
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 27/07/2013, at 3:36 AM, David McNelis <dmcnelis@gmail.com> wrote:
> 
>> I second Romain, do the upgrade and make sure the health is good first.
>> 
>> If you have or plan to have a large number of nodes, you might consider using fewer than 256 as your initial number of vnodes. I think the number in the docs is higher than is reasonable: some people have reported potential performance degradation with a large number of nodes and a very high number of vnodes. If I had it to do over again, I'd have used 64 vnodes as my default (across 20 nodes).
>> 
>> Another thing to be very cognizant of before shuffle is disk space. You *must* have less than 50% used in order to do the shuffle successfully, because no data is removed (cleaned) from a node during the shuffle process, and the shuffle essentially doubles the amount of data until you're able to run a cleanup.
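
The 50% rule above can be checked before starting a shuffle. The following is a minimal sketch that uses Python's standard `shutil.disk_usage`. The function names are hypothetical, and the data directory path is left to the caller because it depends on the installation.

```python
import shutil


def has_shuffle_headroom(used_bytes, total_bytes, max_used_fraction=0.5):
    """Return True if strictly less than max_used_fraction of the disk is used."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes < max_used_fraction


def data_dir_has_headroom(path):
    """Check the filesystem that holds the given path (supplied by the caller)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return has_shuffle_headroom(usage.used, usage.total)


if __name__ == "__main__":
    # "." is a placeholder; point this at the node's actual data directory.
    print(data_dir_has_headroom("."))
```

The check would need to run on every node, since each node's disk fills independently during the shuffle.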
>> 
>> 
>> On Fri, Jul 26, 2013 at 11:25 AM, Romain HARDOUIN <romain.hardouin@urssaf.fr> wrote:
>> Vnodes are a great feature. More nodes are involved during operations such as bootstrap, decommission, etc.
>> DataStax documentation is definitely a must read. 
>> That said, if I were you, I'd wait a while before shuffling the ring. I'd focus on upgrading the cluster and monitoring the nodes (number of file handles, memory usage, latency, etc.).
>> Upgrading from 0.8 to 1.2 can be tricky; there have been many changes since then. Be careful about the compaction strategies you choose and double-check the options.
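
The point that more nodes take part in operations such as bootstrap can be illustrated with a small simulation. This is a rough sketch under stated assumptions: tokens are assigned at random, and only direct ring adjacency is counted. Replication factor and replica placement are ignored, so it does not model real streaming. The function name and ring size are assumptions.

```python
import random

RING_SIZE = 2**64  # assumed token space size, for illustration only


def distinct_ring_neighbours(num_nodes, num_tokens, seed=0):
    """For each node, count the other nodes owning a token adjacent to one of its tokens."""
    rng = random.Random(seed)
    ring = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owners = [node for _, node in ring]
    neighbours = [set() for _ in range(num_nodes)]
    for i, node in enumerate(owners):
        # Look at the previous and next token on the ring, wrapping at both ends.
        for other in (owners[i - 1], owners[(i + 1) % len(owners)]):
            if other != node:
                neighbours[node].add(other)
    return [len(found) for found in neighbours]


if __name__ == "__main__":
    for num_tokens in (1, 256):
        counts = distinct_ring_neighbours(20, num_tokens)
        print(num_tokens, min(counts), max(counts))
```

With a single token per node, each node touches only its two ring neighbours. With many tokens per node, the adjacent ranges are spread over most of the cluster, so work such as streaming can be shared more widely.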
>> 
>> Regards, 
>> Romain 
>> 
>> rash aroskar <rashmi.aroskar@gmail.com> wrote on 25/07/2013 23:25:11:
>> 
>> > From: rash aroskar <rashmi.aroskar@gmail.com> 
>> > To: user@cassandra.apache.org, 
>> > Date: 25/07/2013 23:25 
>> > Subject: cassandra 1.2.5- virtual nodes (num_token) pros/cons? 
>> > 
>> > Hi, 
>> > I am upgrading my Cassandra cluster from 0.8 to 1.2.5.  
>> > In Cassandra 1.2.5 the 'num_tokens' setting confuses me.  
>> > I understand that it assigns multiple tokens to each node, but I am 
>> > not clear how that helps performance or load balancing. Can
>> > anyone elaborate? Has anyone used this feature and knows its 
>> > advantages/disadvantages? 
>> > 
>> > Thanks, 
>> > Rashmi
>> 
> 
> 

