incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: cassandra 1.2.5- virtual nodes (num_token) pros/cons?
Date Mon, 29 Jul 2013 08:15:00 GMT
I would *strongly* recommend against upgrading from 0.8 directly to 1.2. Skipping a major version
is generally not recommended; skipping three would seem like carelessness.

> I second Romain, do the upgrade and make sure the health is good first.
+1, but I would also recommend deciding whether you actually need virtual nodes. The shuffle
process can take a long time and people have had mixed experiences with it.

If you want to move to 1.2 and get vnodes, I would consider spinning up a new cluster and
bulk loading into it. You could do an initial load and then do delta loads using snapshots;
there would, however, be a period of stale data in the new cluster until the last delta snapshot
is loaded.

Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 27/07/2013, at 3:36 AM, David McNelis <dmcnelis@gmail.com> wrote:

> I second Romain, do the upgrade and make sure the health is good first.
> 
> If you have or plan to have a large number of nodes, you might consider using fewer than
> 256 as your initial vnode count. I think the number in the docs is higher than is reasonable,
> as some people have reported potential performance degradation with a large number of nodes
> and a very high number of vnodes. If I had it to do over again, I'd have used 64 vnodes as
> my default (across 20 nodes).
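
[Editor's sketch] To make the load-balancing trade-off above concrete, here is a rough simulation of how the number of tokens per node affects how evenly a ring is shared. It is only an illustration under stated assumptions: the token space `RING_SIZE`, the helper `ownership`, and purely random token placement are all hypothetical simplifications, not Cassandra's partitioner or its token allocation code.

```python
import random
from collections import defaultdict

# Simplified token space (assumption); not Cassandra's actual partitioner range.
RING_SIZE = 2**64


def ownership(num_nodes, tokens_per_node, seed=0):
    """Return the fraction of the ring owned by each node.

    Tokens are placed uniformly at random. Assumes the ring ends up with
    at least two distinct tokens.
    """
    rng = random.Random(seed)
    ring = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(tokens_per_node)
    )
    owned = defaultdict(int)
    # Each token owns the range from the previous token up to itself;
    # index -1 handles the wrap-around for the first token on the ring.
    for i, (token, node) in enumerate(ring):
        prev_token = ring[i - 1][0]
        owned[node] += (token - prev_token) % RING_SIZE
    return {node: owned[node] / RING_SIZE for node in range(num_nodes)}


if __name__ == "__main__":
    nodes = 20
    for tokens in (1, 64, 256):
        shares = ownership(nodes, tokens)
        print(
            f"tokens/node={tokens:3d} ring entries={nodes * tokens:5d} "
            f"min share={min(shares.values()):.4f} "
            f"max share={max(shares.values()):.4f}"
        )
```

With 20 nodes, 64 tokens per node puts 1,280 entries on the ring and 256 puts 5,120. More tokens per node tends to even out each node's share, at the cost of a larger ring to track, which is the trade-off being weighed here.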
> 
> Another thing to be very cognizant of before the shuffle is disk space. You *must* have
> less than 50% of your disk used in order to shuffle successfully, because no data is removed
> (cleaned) from a node during the shuffle and the process essentially doubles the amount
> of data until you're able to run a cleanup.
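
[Editor's sketch] The 50% rule above can be checked before starting a shuffle. The snippet below is a minimal sketch, not an official tool: the helper names `has_shuffle_headroom` and `check_data_dir` are hypothetical, and the path you pass in is assumed to be on the same filesystem as the node's data directory.

```python
import shutil


def has_shuffle_headroom(used_bytes, total_bytes, max_used_fraction=0.5):
    """Return True if strictly less than max_used_fraction of the disk is used."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes < max_used_fraction


def check_data_dir(path):
    """Check the filesystem that holds `path` (assumed to be the data directory)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return has_shuffle_headroom(usage.used, usage.total)


if __name__ == "__main__":
    # "." is a placeholder; point this at the node's actual data directory.
    print("enough headroom for shuffle:", check_data_dir("."))
```

The comparison is strict because the data set roughly doubles during the shuffle, so a disk that is exactly half full has no margin left.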
> 
> 
> On Fri, Jul 26, 2013 at 11:25 AM, Romain HARDOUIN <romain.hardouin@urssaf.fr> wrote:
> Vnodes are a great feature. More nodes are involved during operations such as bootstrap,
> decommission, etc.
> DataStax documentation is definitely a must-read.
> That said, if I were you, I'd wait a while before shuffling the ring. I'd focus on
> the cluster upgrade and on monitoring the nodes (number of file handles, memory usage,
> latency, etc.).
> Upgrading from 0.8 to 1.2 can be tricky; there have been many changes since then. Be careful
> about the compaction strategies you choose and double-check the options.
> 
> Regards, 
> Romain 
> 
> rash aroskar <rashmi.aroskar@gmail.com> wrote on 25/07/2013 23:25:11:
> 
> > From: rash aroskar <rashmi.aroskar@gmail.com>
> > To: user@cassandra.apache.org,
> > Date: 25/07/2013 23:25
> > Subject: cassandra 1.2.5- virtual nodes (num_token) pros/cons?
> > 
> > Hi, 
> > I am upgrading my Cassandra cluster from 0.8 to 1.2.5.
> > In Cassandra 1.2.5 the 'num_token' attribute confuses me.
> > I understand that it distributes multiple tokens per node, but I am
> > not clear how that is helpful for performance or load balancing. Can
> > anyone elaborate? Has anyone used this feature and learned its
> > advantages/disadvantages?
> > 
> > Thanks, 
> > Rashmi
> 

