incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: cassandra 1.2.5- virtual nodes (num_token) pros/cons?
Date Thu, 15 Aug 2013 03:08:59 GMT
> Repair and bootstrap will be limited by the node doing repair or bootstrap, since it has to do the same amount of work whatever num_tokens is.
That's what I was thinking. 
I normally assume repair has very little data to stream, and that most of the time is taken
creating the Merkle trees. In that case the node that repair was started on still has to
compact / hash all its data; however, hashing on the replicas would go faster if it's done on
more machines.
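
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is a toy model, not anything Cassandra actually runs: the function name, the parameters, and the assumption that replica data is spread evenly over the peers and hashed at a fixed rate are all mine.

```python
def repair_hash_time(node_data_gb, hash_rate_gb_per_s, other_replicas, num_peers):
    """Toy estimate of Merkle tree build time for one repair session.

    All parameters are illustrative assumptions:
      node_data_gb       - data held by the node repair was started on
      hash_rate_gb_per_s - rate at which any node can compact / hash data
      other_replicas     - replicas per range besides the initiator (RF - 1)
      num_peers          - nodes that share those replicas
                           (a few neighbours without vnodes, most of the
                           cluster with vnodes)
    """
    # The initiating node always hashes all of its own data.
    initiator_s = node_data_gb / hash_rate_gb_per_s

    # Replica copies are assumed to be spread evenly and hashed in parallel.
    per_peer_s = (other_replicas * node_data_gb / num_peers) / hash_rate_gb_per_s

    # The session cannot finish before its slowest participant.
    wall_clock_s = max(initiator_s, per_peer_s)
    return {
        "initiator_s": initiator_s,
        "per_peer_s": per_peer_s,
        "wall_clock_s": wall_clock_s,
    }


if __name__ == "__main__":
    # Same node, same data: 4 peers (no vnodes) versus 50 peers (vnodes).
    print(repair_hash_time(100.0, 0.5, 2, 4))
    print(repair_hash_time(100.0, 0.5, 2, 50))
```

Under these assumptions the per-peer time drops as num_peers grows, but the wall clock time stays pinned to the initiating node, which is the point made above.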


Cheers
 
-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/08/2013, at 11:13 PM, Richard Low <richard@wentnet.com> wrote:

> On 13 August 2013 10:15, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
> 
> Streaming from all the physical nodes in the cluster should make repair faster, for the same reason it makes bootstrap faster. Shouldn't it?
> 
> Virtual nodes don't speed up either very much.  Repair and bootstrap will be limited by the node doing repair or bootstrap, since it has to do the same amount of work whatever num_tokens is.  It places a more even load across the rest of the cluster though, since it will repair with or bootstrap from all nodes in the cluster.  So the overall time will in most cases be about the same.
> 
> The real speedup from vnodes comes when running removenode, when the streaming happens in parallel across all nodes.
> 
> Richard.

