cassandra-user mailing list archives

From Carlos Rolo <r...@pythian.com>
Subject Re: Replication to second data center with different number of nodes
Date Mon, 30 Mar 2015 06:47:13 GMT
Sharing my experience here.

1) I have never had any issues with DCs of different sizes. If the hardware
is the same, keep num_tokens at 256.
2) In most cases I keep the 256 vnodes and see no performance problems
(when problems do occur, the cause is not the number of vnodes).
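
As a rough illustration of what the vnode count changes, here is a minimal
sketch that assumes tokens are assigned at random on a 2**64 ring and measures
how evenly the ring is split between nodes. The function name `ownership` and
the node counts are made up for the example. It only looks at balance and says
nothing about query performance.

```python
import random


def ownership(node_count, num_tokens, seed=0):
    """Fraction of the ring owned by each node under random token assignment."""
    rng = random.Random(seed)
    ring_size = 2 ** 64
    # Every node picks num_tokens random positions (assumption: purely random).
    ring = sorted(
        (rng.randrange(ring_size), node)
        for node in range(node_count)
        for _ in range(num_tokens)
    )
    owned = [0] * node_count
    # A token owns the range (previous token, token]; the first token also
    # owns the wrap-around range that follows the last token.
    prev = ring[-1][0] - ring_size
    for token, node in ring:
        owned[node] += token - prev
        prev = token
    return [o / ring_size for o in owned]


if __name__ == "__main__":
    for num_tokens in (1, 16, 256):
        fractions = ownership(node_count=6, num_tokens=num_tokens)
        print(num_tokens, round(max(fractions) / min(fractions), 2))
```

The printed ratio is the largest share divided by the smallest share, so
values close to 1 mean the nodes own similar amounts of the ring.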

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: http://linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Mon, Mar 30, 2015 at 6:31 AM, Anishek Agarwal <anishek@gmail.com> wrote:

> Colin,
>
> When you said a larger number of tokens has a query performance hit, is it
> read or write performance? Also, if you have any links you could share to
> shed some light on this, it would be great.
>
> Thanks
> Anishek
>
> On Sun, Mar 29, 2015 at 2:20 AM, Colin Clark <colin@clark.ws> wrote:
>
>> I typically use a value much lower than 256, usually less than 20, for
>> num_tokens, as a larger number has historically had a dramatic impact on
>> query performance.
>> —
>> Colin Clark
>> colin@clark.ws
>> +1 612-859-6129
>> skype colin.p.clark
>>
>> On Mar 28, 2015, at 3:46 PM, Eric Stevens <mightye@gmail.com> wrote:
>>
>> If you're curious about how Cassandra knows how to replicate data in the
>> remote DC, it's the same as in the local DC: replication is independent in
>> each, and you can even set a different replication factor per keyspace
>> per datacenter.  Each node in a DC takes up num_tokens positions on a ring,
>> each partition key is mapped to a position on that ring, and whichever node
>> owns that part of the ring is the primary for that data.  Then
>> (oversimplified) r-1 adjacent nodes become replicas for that same data,
>> where r is the replication factor.
>>
>> On Fri, Mar 27, 2015 at 6:55 AM, Sibbald, Charles <
>> Charles.Sibbald@bskyb.com> wrote:
>>
>>>
>>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__num_tokens
>>>
>>>  So go with the default of 256, and leave initial_token empty:
>>>
>>>  num_tokens: 256
>>>
>>> # initial_token:
>>>
>>>
>>>  Cassandra will always give each node the same number of tokens; the
>>> only time you might want to vary this is if your instances are of
>>> different sizing/capability, which is also a bad scenario.
>>>
>>>   From: Björn Hachmann <bjoern.hachmann@metrigo.de>
>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>> Date: Friday, 27 March 2015 12:11
>>> To: user <user@cassandra.apache.org>
>>> Subject: Re: Replication to second data center with different number of
>>> nodes
>>>
>>>
>>> 2015-03-27 11:58 GMT+01:00 Sibbald, Charles <Charles.Sibbald@bskyb.com>:
>>>
>>>> Cassandra’s Vnodes config
>>>
>>>
>>> Thank you. Yes, we are using vnodes! The num_tokens parameter controls
>>> the number of vnodes assigned to a specific node.
>>>
>>>  Maybe I am seeing problems where there are none.
>>>
>>>  Let me rephrase my question: How does Cassandra know it has to
>>> replicate 1/3 of all keys to each single node in the second DC? I can see
>>> two ways:
>>>  1. It has to be configured explicitly.
>>>  2. It is derived from the number of nodes available in the data center
>>> at the time `nodetool rebuild` is started.
>>>
>>>  Kind regards
>>> Björn
>>>
>>
>>
>>
>
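
To make the ring description quoted above concrete, here is a minimal sketch
of per-DC replica selection. It is a simplification under stated assumptions,
not Cassandra's actual code: tokens are random, MD5 stands in for the real
partitioner, racks are ignored, and the DC names, node names and replication
factors are made up. It suggests an answer to the original question: the
number of copies per DC is configured explicitly, while which node holds a
given key follows from the tokens of the nodes present in that DC.

```python
import bisect
import hashlib
import random


def token_for(key):
    """Map a partition key to a ring position (MD5 stand-in, an assumption)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def build_ring(dcs, num_tokens=256, seed=0):
    """Give every node num_tokens random positions on one shared ring.

    dcs maps a (hypothetical) data center name to its node names.
    Returns a sorted list of (token, dc, node) tuples.
    """
    rng = random.Random(seed)
    return sorted(
        (rng.getrandbits(64), dc, node)
        for dc, nodes in dcs.items()
        for node in nodes
        for _ in range(num_tokens)
    )


def replicas(ring, key, rf_per_dc):
    """Walk clockwise from the key's position and, independently for each DC,
    keep the first rf distinct nodes met (racks are ignored in this sketch)."""
    tokens = [t for t, _, _ in ring]
    start = bisect.bisect_left(tokens, token_for(key))
    chosen = {dc: [] for dc in rf_per_dc}
    for i in range(len(ring)):
        _, dc, node = ring[(start + i) % len(ring)]
        picked = chosen.get(dc)
        if picked is None or node in picked or len(picked) >= rf_per_dc[dc]:
            continue
        picked.append(node)
        if all(len(chosen[d]) == rf for d, rf in rf_per_dc.items()):
            break
    return chosen


if __name__ == "__main__":
    dcs = {"DC1": ["a1", "a2", "a3", "a4", "a5", "a6"],
           "DC2": ["b1", "b2", "b3"]}
    ring = build_ring(dcs)
    print(replicas(ring, "user:42", {"DC1": 3, "DC2": 1}))
```

With one copy per key in a three-node DC2, each DC2 node ends up holding
roughly a third of the keys, simply because each owns roughly a third of the
DC2 token positions; nothing per-node has to be configured for that.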
