cassandra-user mailing list archives

From Chris Lohfink <clohfin...@gmail.com>
Subject Re: reduced num_token = improved performance ??
Date Wed, 12 Jul 2017 07:45:07 GMT
Probably worth mentioning that some operational procedures like repairs,
bootstrapping etc. are helped massively by using fewer tokens. Incremental
repairs are one of the things I would say are most impacted by it, since
fewer tokens mean fewer local ranges to iterate through and less
anticompaction. I would highly recommend using far fewer than 256 in 3.x.
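
To put a rough number on "fewer local ranges", here is a minimal Python sketch.
It is not Cassandra code: it assumes random token assignment and a
SimpleStrategy-like placement (each range is replicated on the next rf distinct
nodes clockwise on the ring). The function name local_range_counts and the
64-bit token space are illustrative assumptions; with NetworkTopologyStrategy
the exact counts will differ.

```python
import random
from collections import Counter


def local_range_counts(num_nodes, num_tokens, rf, seed=0):
    """Count how many token ranges each node replicates.

    Simplified model (an assumption, not Cassandra's implementation):
    tokens are random, and each range is stored on the next `rf`
    distinct nodes found walking clockwise from the range's owner.
    """
    if rf > num_nodes:
        raise ValueError("rf must not exceed num_nodes")
    rng = random.Random(seed)
    # One (token, node) entry per vnode, sorted into ring order.
    ring = sorted(
        (rng.getrandbits(64), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owners = [node for _, node in ring]
    total = len(owners)
    counts = Counter()
    for i in range(total):
        replicas = []
        j = i
        # Walk clockwise until rf distinct nodes have been collected.
        while len(replicas) < rf:
            node = owners[j % total]
            if node not in replicas:
                replicas.append(node)
            j += 1
        for node in replicas:
            counts[node] += 1
    return counts


if __name__ == "__main__":
    for num_tokens in (16, 256):
        counts = local_range_counts(num_nodes=6, num_tokens=num_tokens, rf=3)
        print(num_tokens, "tokens ->", max(counts.values()), "local ranges on the busiest node")
```

In this model the number of ranges a node has to walk during repair grows
roughly in proportion to num_tokens, which is the effect described above.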

Chris

On Tue, Jul 11, 2017 at 8:36 PM, Justin Cameron <justin@instaclustr.com>
wrote:

> Hi,
>
> Using fewer vnodes means you'll have a higher chance of hot spots in your
> cluster. Hot spots in Cassandra are nodes that, by random chance, are
> responsible for a higher percentage of the token space than others. This
> means they will receive more data and also more traffic/load than other
> nodes in the cluster.
>
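
The "random chance" effect can be sketched with a small Python simulation. This
is a toy model under stated assumptions, not Cassandra's allocator: tokens are
drawn uniformly from an unsigned 64-bit ring (real Murmur3 tokens are signed),
and the names ownership and imbalance are made up for illustration. It only
models purely random assignment, not the CASSANDRA-7032 allocation.

```python
import random

RING = 2**64  # assumed size of the token space for this toy model


def ownership(num_nodes, num_tokens, seed=0):
    """Return the amount of token space owned by each node.

    Each node picks `num_tokens` random tokens; a token owns the range
    from the previous token on the ring (exclusive) up to itself.
    """
    rng = random.Random(seed)
    ring = sorted(
        (rng.randrange(RING), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    for i, (token, node) in enumerate(ring):
        prev = ring[i - 1][0]  # i == 0 wraps around to the last token
        owned[node] += (token - prev) % RING
    return owned


def imbalance(owned):
    """Ratio of the largest share to the average share (1.0 is perfectly even)."""
    return max(owned) * len(owned) / sum(owned)


if __name__ == "__main__":
    for num_tokens in (1, 16, 256):
        print(num_tokens, round(imbalance(ownership(12, num_tokens)), 3))
```

Running it for a few values of num_tokens and a few seeds gives a feel for how
uneven purely random ownership can get when each node holds few tokens.
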
> CASSANDRA-7032 goes a long way towards addressing this issue by allocating
> vnode tokens more intelligently, rather than just randomly assigning them.
> If you're using a version of Cassandra that contains this feature (3.0+),
> you can use a smaller number of vnodes in your cluster.
>
> A high number of vnodes won't affect performance for most Cassandra
> workloads, but if you're running tasks that need to do token-range scans
> (such as Spark), there is usually a significant performance hit.
>
> If you're on C* 3.0+ and are using Spark (or similar workloads - cassandra
> lucene index plugin is also affected) then I'd recommend using fewer vnodes
> - 16 would be ok. You'll probably still see some variance in token-space
> ownership between nodes, but the trade-off for better Spark performance
> will likely be worth it.
>
> Justin
>
> On Wed, 12 Jul 2017 at 00:34 ZAIDI, ASAD A <az192g@att.com> wrote:
>
>> Hi Folks,
>>
>>
>>
>> Pardon me if I’m missing something obvious. I’m still using
>> apache-cassandra 2.2 and planning to upgrade to 3.x.
>>
>> I came across this jira [https://issues.apache.org/jira/browse/CASSANDRA-7032]
>> that suggests reducing num_tokens may improve general performance of
>> Cassandra, e.g. having num_tokens=16 instead of 256 may help!
>>
>>
>>
>> Can you please suggest whether having a lower num_tokens would provide real
>> performance benefits, or whether it comes with any downsides that we should
>> also consider? I’d much appreciate your insights.
>>
>>
>>
>> Thank you
>>
>> Asad
>>
