cassandra-user mailing list archives

From Richard Lowe <richard.l...@arkivum.com>
Subject RE: How to come up with a predefined topology
Date Thu, 12 Jul 2012 00:03:27 GMT
Using PropertyFileSnitch you can fine-tune the topology of the cluster.

What you tell Cassandra about your "DC" and "rack" doesn't have to match how they are in real
life. You can create virtual DCs for Cassandra and even treat each node as a separate rack.

For example, in cassandra-topology.properties:

# Format is <Node IP>=<DC Name>:<Rack Name>
192.168.0.11=DC1_realtime:node_1
192.168.0.12=DC1_realtime:node_2
192.168.0.13=DC1_analytics:node_3
192.168.1.11=DC2_realtime:node_1
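
To sanity-check a file like this before deploying it, the mapping can be parsed and grouped by DC. The following is only a rough Python sketch, not anything that ships with Cassandra: the function names parse_topology and nodes_by_dc are made up for illustration, and it assumes plain IPv4 entries in the <Node IP>=<DC Name>:<Rack Name> format shown above.

```python
from collections import defaultdict

# Sample content, copied from the cassandra-topology.properties example above.
EXAMPLE = """\
# Format is <Node IP>=<DC Name>:<Rack Name>
192.168.0.11=DC1_realtime:node_1
192.168.0.12=DC1_realtime:node_2
192.168.0.13=DC1_analytics:node_3
192.168.1.11=DC2_realtime:node_1
"""


def parse_topology(text):
    """Map each node IP to its (DC name, rack name) pair (hypothetical helper)."""
    topology = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        ip, _, location = line.partition("=")
        dc, _, rack = location.partition(":")
        topology[ip.strip()] = (dc.strip(), rack.strip())
    return topology


def nodes_by_dc(topology):
    """Group node IPs by the DC name they were assigned to (hypothetical helper)."""
    grouped = defaultdict(list)
    for ip, (dc, _rack) in sorted(topology.items()):
        grouped[dc].append(ip)
    return dict(grouped)


if __name__ == "__main__":
    for dc, ips in nodes_by_dc(parse_topology(EXAMPLE)).items():
        print(dc, ips)
```

Grouping by DC makes it easy to see at a glance how many nodes each virtual DC actually contains.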

If you then specify the parameters for the keyspace to use these, you can control exactly
which set of nodes replicas end up on. 

For example, in cassandra-cli:

create keyspace ks1 with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
and strategy_options = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
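
One thing worth checking is that each DC named in strategy_options exists in the topology and has at least as many nodes as the replicas requested for it. A minimal Python sketch of that check follows; check_replication is a hypothetical helper written for this example (not a Cassandra API), and the node counts are taken from the four-node topology file above.

```python
# Nodes per DC, counted from the example cassandra-topology.properties above.
NODES_PER_DC = {"DC1_realtime": 2, "DC1_analytics": 1, "DC2_realtime": 1}

# Replica counts, copied from the strategy_options of keyspace ks1.
STRATEGY_OPTIONS = {"DC1_realtime": 2, "DC1_analytics": 1, "DC2_realtime": 1}


def check_replication(strategy_options, nodes_per_dc):
    """Return a list of problems; an empty list means the options fit the topology."""
    problems = []
    for dc, replicas in strategy_options.items():
        available = nodes_per_dc.get(dc, 0)
        if available == 0:
            problems.append(f"{dc}: not defined in the topology")
        elif replicas > available:
            problems.append(
                f"{dc}: {replicas} replicas requested but only {available} node(s)"
            )
    return problems


if __name__ == "__main__":
    print("total replicas per row:", sum(STRATEGY_OPTIONS.values()))
    print("problems:", check_replication(STRATEGY_OPTIONS, NODES_PER_DC))
```

For the keyspace above this reports no problems: every DC gets exactly as many replicas as it has nodes, so each row ends up on all four nodes.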

As far as I know there isn't any way to use the rack name in the strategy_options for a keyspace.
You might want to look at the code to dig into that, perhaps.

Whichever snitch you use, the nodes are sorted in order of proximity to the client node. How
this is determined depends on the snitch, but most of the ones that ship with Cassandra
use the default ordering of same-node < same-rack < same-datacenter < different-datacenter.
Each snitch has methods to tell Cassandra which rack and DC a node is in, so it always knows
which node is closest. Combined with the replica placement computed by the replication strategy,
this tells Cassandra where the nearest replica is.
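
The default ordering can be illustrated with a small Python sketch. This is not Cassandra's implementation, just the same-node < same-rack < same-datacenter < different-datacenter rule applied to the example topology; proximity_rank and sort_by_proximity are names invented for this illustration.

```python
# (DC name, rack name) per node, from the example cassandra-topology.properties.
TOPOLOGY = {
    "192.168.0.11": ("DC1_realtime", "node_1"),
    "192.168.0.12": ("DC1_realtime", "node_2"),
    "192.168.0.13": ("DC1_analytics", "node_3"),
    "192.168.1.11": ("DC2_realtime", "node_1"),
}


def proximity_rank(origin, candidate, topology):
    """Lower rank means closer: 0 same node, 1 same rack, 2 same DC, 3 other DC."""
    if candidate == origin:
        return 0
    origin_dc, origin_rack = topology[origin]
    candidate_dc, candidate_rack = topology[candidate]
    # Rack names are only comparable within the same DC.
    if candidate_dc == origin_dc and candidate_rack == origin_rack:
        return 1
    if candidate_dc == origin_dc:
        return 2
    return 3


def sort_by_proximity(origin, replicas, topology):
    """Order replicas from closest to farthest as seen from the origin node."""
    return sorted(replicas, key=lambda node: proximity_rank(origin, node, topology))


if __name__ == "__main__":
    replicas = ["192.168.1.11", "192.168.0.13", "192.168.0.12", "192.168.0.11"]
    print(sort_by_proximity("192.168.0.11", replicas, TOPOLOGY))
```

Note that with virtual DCs, 192.168.0.13 ranks as "different datacenter" from 192.168.0.11 even if the two machines sit side by side, because the ordering only sees the DC and rack names the snitch was given.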



-----Original Message-----
From: prasenjit mukherjee [mailto:prasen.bea@gmail.com] 
Sent: 11 July 2012 06:33
To: user
Subject: How to come up with a predefined topology

Quoting from http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy:

"Asymmetrical replication groupings are also possible depending on your use case. For example,
you may want to have three replicas per data center to serve real-time application requests,
and then have a single replica in a separate data center designated to running analytics."

I have 2 questions:
1. Is there an example of how to configure a topology with 3 replicas in one DC (2 in one rack +
1 in another rack) and one replica in another DC?
 The default NetworkTopologyStrategy with RackInferringSnitch will only give me an equal distribution
(2+2).

2. I am assuming reads can go to any of the replicas. Is there a client that will send
a query to the node (in the Cassandra ring) that is closest to the client?

-Thanks,
Prasenjit


