cassandra-user mailing list archives

From Kyrylo Lebediev <Kyrylo_Lebed...@epam.com.INVALID>
Subject Re: Performance impact of using NetworkTopology with 3 node cassandra cluster in One DC
Date Thu, 02 Aug 2018 11:54:44 GMT
In Cassandra, two things define what is called network topology: datacenter and rack.

rack - not necessarily a physical rack, but rather a single point of failure. For example,
on AWS an availability zone is usually chosen to be a Cassandra rack.

datacenter - a set of racks connected by a good network with low latency between them.
On AWS it is usually a region.


If you use NetworkTopologyStrategy with a properly configured snitch, network topology is taken
into account during replica placement: there won't be more than 1 replica of a data chunk
in a rack (provided there are at least as many racks as the replication factor). This means
that if a whole rack fails (for example, an AWS AZ goes offline), there are still 2 other
replicas online for each chunk of data (with RF=3), and queries with CL=QUORUM still succeed.
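
To illustrate the placement rule described above, here is a minimal sketch in Python. It is a simplified model, not Cassandra's actual implementation: the ring layout, the node names and the place_replicas helper are all made up for the example, and it assumes one token per node and at least as many racks as the replication factor.

```python
# Simplified model of rack-aware replica placement (NOT Cassandra's real code).
# Hypothetical ring: (node, rack) pairs listed in token order.
RING = [
    ("n1", "az-a"), ("n2", "az-a"),
    ("n3", "az-b"), ("n4", "az-b"),
    ("n5", "az-c"), ("n6", "az-c"),
]

def place_replicas(ring, start, rf):
    """Walk the ring clockwise from `start` and pick nodes from distinct racks."""
    order = ring[start:] + ring[:start]
    chosen, racks_used = [], set()
    for node, rack in order:
        if rack in racks_used:
            continue  # skip: this rack already holds a replica of the chunk
        chosen.append((node, rack))
        racks_used.add(rack)
        if len(chosen) == rf:
            break
    return chosen

def replicas_alive(replicas, failed_rack):
    """Replicas that remain reachable when a whole rack goes offline."""
    return [node for node, rack in replicas if rack != failed_rack]

if __name__ == "__main__":
    rf = 3
    quorum = rf // 2 + 1
    for start in range(len(RING)):
        replicas = place_replicas(RING, start, rf)
        alive = replicas_alive(replicas, "az-b")
        print(replicas, "-> alive without az-b:", alive, "quorum ok:", len(alive) >= quorum)
```

In this model every chunk keeps 2 of its 3 replicas when any single rack is lost, which is enough for QUORUM with RF=3.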

To avoid data imbalance between nodes (which may cause "hot spots" in your cluster and
hurt performance), all racks should have the same number of nodes with approximately
the same capacity.
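
A rough back-of-the-envelope sketch of why uneven racks lead to imbalance. It assumes RF=3 with exactly 3 racks, so that each rack ends up holding one full copy of the data, and an even spread of data across the nodes inside a rack. The per_node_load helper and the 300 GB figure are hypothetical.

```python
def per_node_load(dataset_gb, nodes_per_rack):
    """Return {rack: GB stored per node}.

    Assumption: RF equals the number of racks, so each rack holds one
    full copy of the dataset, split evenly among the nodes in that rack.
    """
    return {rack: dataset_gb / count for rack, count in nodes_per_rack.items()}

if __name__ == "__main__":
    # Same number of nodes in every rack: every node carries the same load.
    print(per_node_load(300, {"az-a": 2, "az-b": 2, "az-c": 2}))
    # Uneven racks: the lone node in az-a carries three times the load of an az-c node.
    print(per_node_load(300, {"az-a": 1, "az-b": 2, "az-c": 3}))
```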


Also, CL=QUORUM is sometimes used where CL=LOCAL_QUORUM should be used instead.
There is no difference between the two with a single DC, but with two or more DCs
the former leads to cross-DC communication, since a majority of all replicas across all DCs
must respond. This obviously leads to increased latencies. The same is true, for example, for
ONE vs LOCAL_ONE.
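
The arithmetic behind this can be sketched as follows. The datacenter names and replication factors are example values, and the helper functions are only for illustration. A quorum is a majority of the replicas counted: all replicas in all DCs for QUORUM, only the replicas in the local DC for LOCAL_QUORUM.

```python
def quorum(rf_per_dc):
    """Replicas that must respond for QUORUM: majority of ALL replicas."""
    return sum(rf_per_dc.values()) // 2 + 1

def local_quorum(rf_per_dc, local_dc):
    """Replicas that must respond for LOCAL_QUORUM: majority within the local DC."""
    return rf_per_dc[local_dc] // 2 + 1

def min_remote_replicas(rf_per_dc, local_dc):
    """Minimum number of replicas outside the local DC that QUORUM has to wait for."""
    return max(0, quorum(rf_per_dc) - rf_per_dc[local_dc])

if __name__ == "__main__":
    one_dc = {"dc1": 3}
    two_dcs = {"dc1": 3, "dc2": 3}
    for layout in (one_dc, two_dcs):
        print(
            layout,
            "QUORUM:", quorum(layout),
            "LOCAL_QUORUM:", local_quorum(layout, "dc1"),
            "remote replicas needed for QUORUM:", min_remote_replicas(layout, "dc1"),
        )
```

With one DC and RF=3, both levels need 2 replicas. With two DCs at RF=3 each, QUORUM needs 4 of 6 replicas, so at least one response has to come from the remote DC, while LOCAL_QUORUM still needs only 2 local ones.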


If you look at the documentation on how to add a DC to a cluster, you'll find cautions about
using QUORUM/LOCAL_QUORUM during the operation. The reason is that while data which is supposed
to be in the new DC isn't there yet (streaming is in progress and hasn't completed),
it will cause blocking read repairs.


Regards,

Kyrill


________________________________
From: Murtaza Talwari <mdt_100@hotmail.com>
Sent: Thursday, August 2, 2018 1:22:16 PM
To: user@cassandra.apache.org
Subject: Performance impact of using NetworkTopology with 3 node cassandra cluster in One DC


We are using a 3-node Cassandra cluster in one data center.

For our keyspaces, as suggested in best practices, we are using NetworkTopologyStrategy as the
replication strategy, with the GossipingPropertyFileSnitch.

For read/write consistency we are using QUORUM.



In the majority of cases, users who choose NetworkTopologyStrategy as the replication strategy
have multiple datacenters configured.

In our case we have only one datacenter.

  *   Given that, will using NetworkTopologyStrategy as the replication strategy cause any
performance impact?
  *   We are using QUORUM for read/write consistency, which takes multiple datacenters into account.
Does QUORUM consistency have any performance impact? Is it OK to continue using QUORUM,
considering future expansion to more datacenters?



Please suggest.



Regards,

