cassandra-user mailing list archives

From "Walsh, Stephen" <Stephen.Wa...@Aspect.com>
Subject Re: Balancing tokens over 2 datacenter
Date Thu, 14 Apr 2016 12:13:59 GMT
Hi Alain,

If you look below (the chain is getting long, I know), I mentioned that we are indeed using
DCAwareRoundRobinPolicy:

"We use the DCAwareRoundRobinPolicy in our Java DataStax driver in each DC's application to
point to that Cassandra DC."

Indeed, having all data on all nodes is a trade-off, but it allows one DC to go down, or
two nodes within a single DC, while still ensuring maximum uptime.

I'm afraid that all applications are reading from DC1, despite having a preferred read DC
of DC2.
I believe this is because the primary tokens were created in DC1, due to an initial
misconfiguration: when our applications were first started, they only used DC1 to create the
keyspaces and tables.

Steve


From: Alain RODRIGUEZ <arodrime@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, 14 April 2016 at 12:57
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Balancing tokens over 2 datacenter

100% ownership on all nodes isn’t wrong with 3 nodes in each of 2 DCs with RF=3 in both
of those DCs. That’s exactly what you’d expect it to be, and a perfectly viable production
config for many workloads.

+1, no doubt about it. The only thing is that all the nodes own the exact same data, meaning
the data is replicated 6 times, once on each of the 6 machines. That is expensive in storage
but quite safe. It is a trade-off to consider, but it is ok from a Cassandra point of view,
nothing "wrong" there.


We see all the writes are balanced (due to the replication factor) but all reads only go to
DC1.
So with that, I believe the configuration is confirmed :)

Any way to balance the primary tokens over the two DCs? :)


Steve, I thought it was now ok.

Could you confirm this?

Are you using something like 'new DCAwareRoundRobinPolicy("DC1")' as pointed out in Bhuvan's
link http://stackoverflow.com/questions/22813045/ability-to-write-to-a-particular-cassandra-node
? You can use some other

Then make sure to deploy this on the clients that need to use 'DC1', and 'new DCAwareRoundRobinPolicy("DC2")'
on the clients that should be using 'DC2'.

Are your clients using the 'DCAwareRoundRobinPolicy', and are the clients in the datacenter
related to DC2 using 'new DCAwareRoundRobinPolicy("DC2")'?

This is really the only thing I can think of right now...

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-04-14 11:43 GMT+02:00 Walsh, Stephen <Stephen.Walsh@aspect.com>:
Thanks Guys,

I tend to agree that it’s a viable configuration (but I’m biased).
We use Datadog monitoring to view reads and writes per node.

We see all the writes are balanced (due to the replication factor) but all reads only go to
DC1.
So with that, I believe the configuration is confirmed :)

Any way to balance the primary tokens over the two DCs? :)

Steve

From: Jeff Jirsa <jeff.jirsa@crowdstrike.com>
Reply-To: <user@cassandra.apache.org>
Date: Thursday, 14 April 2016 at 03:05
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Balancing tokens over 2 datacenter

100% ownership on all nodes isn’t wrong with 3 nodes in each of 2 DCs with RF=3 in both
of those DCs. That’s exactly what you’d expect it to be, and a perfectly viable production
config for many workloads.
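
The arithmetic behind that statement can be sketched in a few lines of Python. This is only
an illustration, assuming tokens are evenly balanced within each DC; the function names are
hypothetical and not part of any Cassandra tool. The RF and node counts are the ones
reported in this thread.

```python
def effective_ownership(replication_factor: int, nodes_in_dc: int) -> float:
    """Fraction of the keyspace each node in a DC holds, assuming evenly spread tokens."""
    if nodes_in_dc <= 0:
        raise ValueError("a datacenter needs at least one node")
    # A DC cannot hold more distinct replicas of a row than it has nodes.
    replicas = min(replication_factor, nodes_in_dc)
    return replicas / nodes_in_dc


def total_copies(rf_per_dc: dict) -> int:
    """Total replicas of each row across all DCs (assumes RF <= node count in each DC)."""
    return sum(rf_per_dc.values())


# Values from this thread: RF=3 in each of two DCs, 3 nodes per DC.
replication = {"DC1": 3, "DC2": 3}
nodes = {"DC1": 3, "DC2": 3}

for dc, rf in replication.items():
    print(f"{dc}: each node owns {effective_ownership(rf, nodes[dc]):.1%}")
print("copies of each row:", total_copies(replication))
```

With RF equal to the node count, every node in the DC holds every row, so 100% effective
ownership is expected. With 6 nodes per DC and RF=3, the same function would give 50%.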



From: Anuj Wadehra
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, April 13, 2016 at 6:02 PM
To: "user@cassandra.apache.org"
Subject: Re: Balancing tokens over 2 datacenter

Hi Stephen Walsh,

As per the nodetool output, every node owns 100% of the range. This indicates wrong configuration.
It would be good if you could verify and share the following properties from cassandra.yaml
on all nodes:

Num tokens, seeds, cluster name, listen address, initial token.

Also, which snitch are you using? If you use PropertyFileSnitch, please share cassandra-topology.properties
too.



Thanks
Anuj


On Wed, 13 Apr, 2016 at 9:46 PM, Walsh, Stephen <Stephen.Walsh@Aspect.com> wrote:
Right again, Alain.
We use the DCAwareRoundRobinPolicy in our Java DataStax driver in each DC's application to
point to that Cassandra DC.



From: Alain RODRIGUEZ <arodrime@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, 13 April 2016 at 15:52
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Balancing tokens over 2 datacenter

Steve,

This cluster looks just great.

Now, due to a misconfiguration in our application, we saw that our applications in both DCs
were pointing to DC1.

This is the only thing to solve, and it is done in the client-side configuration.

What client do you use?

Are you using something like 'new DCAwareRoundRobinPolicy("DC1")' as pointed out in Bhuvan's
link http://stackoverflow.com/questions/22813045/ability-to-write-to-a-particular-cassandra-node
? You can use some other

Then make sure to deploy this on the clients that need to use 'DC1', and 'new DCAwareRoundRobinPolicy("DC2")'
on the clients that should be using 'DC2'.

Make sure ports are open.

This should be it,

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com



2016-04-13 16:28 GMT+02:00 Walsh, Stephen <Stephen.Walsh@aspect.com>:
Thanks for your help, guys,

As you guessed, our schema is:

{'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3'}  AND durable_writes = false;


Our reads and writes are at LOCAL_ONE, with each application (now) using its own DC as its
preferred DC.

Here is the nodetool status for one of our tables (all tables are created the same way):


Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  X.0.0.149  14.6 MB   256     100.0%            0f497235-a0bb-4e47-9434-dd0e126aa432  RAC3
UN  X.0.0.251  12.33 MB  256     100.0%            a1307717-4b61-4d57-8658-50460d6d54a1  RAC1
UN  X.0.0.79   21.54 MB  256     100.0%            f353c8f3-6b7c-483b-ad9a-3d66d469079e  RAC2

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  X.0.2.32   18.08 MB  256     100.0%            103a1cb3-6580-44bd-bf97-28ae160e1119  RAC6
UN  X.0.2.211  12.46 MB  256     100.0%            8c8dd5ba-806d-43eb-9ee5-af463e443f46  RAC5
UN  X.0.2.186  12.58 MB  256     100.0%            aef904ba-aaab-47f1-9bdc-cc1e0c676f61  RAC4
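
For anyone who wants to check such output programmatically, here is a rough Python sketch
that groups the "Owns (effective)" column by datacenter. It assumes the column layout shown
in this message (load printed as a value plus a unit, eight whitespace-separated fields per
node line); other Cassandra versions may print a different layout. The function name
owns_by_datacenter is made up for this example.

```python
STATUS_OUTPUT = """\
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  X.0.0.149  14.6 MB   256     100.0%            0f497235-a0bb-4e47-9434-dd0e126aa432  RAC3
UN  X.0.0.251  12.33 MB  256     100.0%            a1307717-4b61-4d57-8658-50460d6d54a1  RAC1
UN  X.0.0.79   21.54 MB  256     100.0%            f353c8f3-6b7c-483b-ad9a-3d66d469079e  RAC2

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  X.0.2.32   18.08 MB  256     100.0%            103a1cb3-6580-44bd-bf97-28ae160e1119  RAC6
UN  X.0.2.211  12.46 MB  256     100.0%            8c8dd5ba-806d-43eb-9ee5-af463e443f46  RAC5
UN  X.0.2.186  12.58 MB  256     100.0%            aef904ba-aaab-47f1-9bdc-cc1e0c676f61  RAC4
"""


def owns_by_datacenter(status_text):
    """Return {datacenter: {address: effective ownership in percent}}."""
    result = {}
    current_dc = None
    for line in status_text.splitlines():
        fields = line.split()
        if line.startswith("Datacenter:"):
            current_dc = fields[1]
            result[current_dc] = {}
        # Node lines have 8 fields here: status, address, load value, load unit,
        # tokens, owns, host ID, rack. Header and separator lines do not match.
        elif current_dc and len(fields) == 8 and fields[5].endswith("%"):
            address, owns = fields[1], fields[5]
            result[current_dc][address] = float(owns.rstrip("%"))
    return result


for dc, nodes in owns_by_datacenter(STATUS_OUTPUT).items():
    print(dc, nodes)
```

Every node in both datacenters reports 100.0, which is consistent with RF=3 on three nodes
per DC.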


We ran nodetool repair and cleanup in case the nodes were balanced but needed cleaning
up. This was not the case :(


Steve


From: Alain RODRIGUEZ <arodrime@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, 13 April 2016 at 14:48
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Balancing tokens over 2 datacenter

Hi Steve,

As such, all keyspaces and tables were created on DC1.
The effect of this is that all reads are now going to DC1 and ignoring DC2.

I think this is not exactly true. Tables are created in a specific keyspace, and no matter
where you send the schema change, the schema will propagate to all the datacenters the
keyspace is replicated to.

So the question is: Is your keyspace using 'DC1: 3, DC2: 3' as replication factors? Could
you show us the schema and a nodetool status as well?

We’ve tried running nodetool repair / cleanup, but the reads always go to DC1

Trying to do random things is often not a good idea. For example, as each node holds 100%
of the data, cleanup is an expensive no-op :-).

Anyone know how to rebalance the tokens over DCs?

Yes, I can help on that, but I need to know your current status.

Basically, your keyspace(s) must use an RF of 3 in both DCs as mentioned, and your clients
must be configured to stick to the DC in their zone (use a DCAware policy with the DC name +
LOCAL_ONE/QUORUM, see Bhuvan's links). Then things should be better.
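
To picture what "sticking to the DC in their zone" means, here is a toy model in plain
Python. It is not driver code and does not reproduce how any DataStax driver implements its
policy: the class ToyDCAwareRoundRobin and its method next_coordinator are hypothetical, and
real drivers also track host state and can optionally fall back to remote DCs, which this
sketch leaves out. The addresses are taken from the nodetool status output in this thread.

```python
from itertools import cycle


class ToyDCAwareRoundRobin:
    """Toy model of DC-aware coordinator selection; NOT a driver implementation."""

    def __init__(self, local_dc, hosts):
        # hosts maps a node address to the name of its datacenter.
        self.local_hosts = sorted(a for a, dc in hosts.items() if dc == local_dc)
        if not self.local_hosts:
            raise ValueError(f"no hosts found in local DC {local_dc!r}")
        self._ring = cycle(self.local_hosts)

    def next_coordinator(self):
        """Return the next local-DC host, rotating through them in turn."""
        return next(self._ring)


HOSTS = {
    "X.0.0.149": "DC1",
    "X.0.0.251": "DC1",
    "X.0.0.79": "DC1",
    "X.0.2.32": "DC2",
    "X.0.2.211": "DC2",
    "X.0.2.186": "DC2",
}

# A client deployed next to DC2 names DC2 as its local DC.
dc2_client = ToyDCAwareRoundRobin("DC2", HOSTS)
picked = [dc2_client.next_coordinator() for _ in range(6)]
print(picked)
```

In this model a client configured with "DC2" only ever picks DC2 nodes as coordinators. If a
client that lives in DC2 is built with "DC1" instead, all of its requests land in DC1, which
matches the symptom described in this thread.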

If you need more detailed help, let us know what is unclear to you and provide us with 'nodetool
status' output and with your schema (at least keyspaces config).

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com







2016-04-13 15:32 GMT+02:00 Bhuvan Rawal <bhu1rawal@gmail.com>:
This could be because of the way you have configured the policy. Have a look at the links
below for configuring the policy:

https://datastax.github.io/python-driver/api/cassandra/policies.html

http://stackoverflow.com/questions/22813045/ability-to-write-to-a-particular-cassandra-node

Regards,
Bhuvan

On Wed, Apr 13, 2016 at 6:54 PM, Walsh, Stephen <Stephen.Walsh@aspect.com> wrote:
Hi there,

So we have 2 datacenters with 3 nodes each.
Replication factor is 3 per DC (so each node has all data).

We have an application in each DC that writes to that Cassandra DC.

Now, due to a misconfiguration in our application, we saw that our applications in both DCs
were pointing to DC1.

As such, all keyspaces and tables were created on DC1.
The effect of this is that all reads are now going to DC1 and ignoring DC2.

We’ve tried running nodetool repair / cleanup, but the reads always go to DC1.

Anyone know how to rebalance the tokens over DCs?


Regards
Steve


P.S I know about this article
http://www.datastax.com/dev/blog/balancing-your-cassandra-cluster
But it doesn’t answer my question regarding token balancing across 2 DCs.

This email (including any attachments) is proprietary to Aspect Software, Inc. and may contain
information that is confidential. If you have received this message in error, please do not
read, copy or forward this message. Please notify the sender immediately, delete it from your
system and destroy any copies. You may not further disclose or distribute this email or its
attachments.

