cassandra-user mailing list archives

From prasenjit mukherjee <prasen....@gmail.com>
Subject Re: How to come up with a predefined topology
Date Fri, 13 Jul 2012 04:24:39 GMT
On Fri, Jul 13, 2012 at 4:04 AM, aaron morton <aaron@thelastpickle.com> wrote:
> The logic is here
> https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java#L78

Thanks Aaron for pointing to the code.

>
> a. n>r : I am assuming, have 1 replica in each rack.
>
> You have 1 replica in the first n racks.
>
> b. n<r : ?? I am assuming, try to equally distribute replicas
> across the racks.
>
> int(n/r) racks will have the same number of replicas. n % r will have more.

Did you mean r%n ( since r>n ) ?

Shouldn't the logic be : every rack will have at least int(r/n) replicas,
and r%n of the racks will have 1 additional replica ?

Sample use case ( r = 8 replicas, n = 3 racks )
rack 1 : 3 ( 2+1 )
rack 2 : 3 ( 2+1 )
rack 3 : 2

Is the above understanding correct ?
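
To make the arithmetic I am proposing concrete, here is a minimal sketch in Python. The helper name replicas_per_rack is made up for illustration; it only models the int(r/n) and r%n split described above, not what NetworkTopologyStrategy actually does (which node gets a replica there also depends on token order in the ring).

```python
def replicas_per_rack(replicas: int, racks: int) -> list[int]:
    """Hypothetical helper: spread `replicas` over `racks` as evenly as possible."""
    if racks <= 0:
        raise ValueError("racks must be positive")
    # base = int(r/n), extra = r % n
    base, extra = divmod(replicas, racks)
    # Every rack gets `base` replicas; the first `extra` racks get one more.
    return [base + 1 if i < extra else base for i in range(racks)]


print(replicas_per_rack(8, 3))  # r = 8, n = 3 -> [3, 3, 2]
print(replicas_per_rack(2, 3))  # r = 2, n = 3 -> [1, 1, 0]
```

The second call covers the n>r case: under this model only the first r racks hold a replica.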

-Thanks,
Prasenjit

>
> This is why multi rack replication can be tricky.
>
> Hope that helps.
>
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 12/07/2012, at 8:05 PM, prasenjit mukherjee wrote:
>
> Thanks. Some follow up questions :
>
> 1. How do the reads use strategy/snitch information ? I am assuming
> the reads can go to any of the replicas. Will it also use the
> snitch/strategy info to find the next 'R' replicas 'closest' to the
> coordinator-node ?
>
> 2. In a single DC ( with n racks and r replicas ) what algorithm
> does cassandra use to place its replicas in the following scenarios :
> a. n>r : I am assuming, have 1 replica in each rack.
> b. n<r : ?? I am assuming, try to equally distribute replicas
> across the racks.
>
> -Thanks,
> Prasenjit
>
> On Thu, Jul 12, 2012 at 11:24 AM, Tyler Hobbs <tyler@datastax.com> wrote:
>
> I highly recommend specifying the same rack for all nodes (using
> cassandra-topology.properties) unless you really have a good reason not to
> (and you probably don't).  The way that replicas are chosen when multiple
> racks are in play can be fairly confusing and lead to a data imbalance if
> you don't catch it.
>
> On Wed, Jul 11, 2012 at 10:53 PM, prasenjit mukherjee <prasen.bea@gmail.com> wrote:
>
> As far as I know there isn't any way to use the rack name in the
> strategy_options for a keyspace. You might want to look at the code to dig
> into that, perhaps.
>
> Aha, I was wondering if I could do that as well ( specify rack options ) :)
>
> Thanks for the pointer, I will dig into the code.
>
> -Thanks,
> Prasenjit
>
> On Thu, Jul 12, 2012 at 5:33 AM, Richard Lowe <richard.lowe@arkivum.com> wrote:
>
> If you then specify the parameters for the keyspace to use these, you
> can control exactly which set of nodes replicas end up on.
>
> For example, in cassandra-cli:
>
> create keyspace ks1
>   with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
>   and strategy_options = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
>
> As far as I know there isn't any way to use the rack name in the
> strategy_options for a keyspace. You might want to look at the code to dig
> into that, perhaps.
>
> Whichever snitch you use, the nodes are sorted in order of proximity to
> the client node. How this is determined depends on the snitch that's used
> but most (the ones that ship with Cassandra) will use the default ordering
> of same-node < same-rack < same-datacenter < different-datacenter. Each
> snitch has methods to tell Cassandra which rack and DC a node is in, so it
> always knows which node is closest. Used with the Bloom filters this can
> tell us where the nearest replica is.
>
> -----Original Message-----
> From: prasenjit mukherjee [mailto:prasen.bea@gmail.com]
> Sent: 11 July 2012 06:33
> To: user
> Subject: How to come up with a predefined topology
>
> Quoting from
> http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy :
>
> "Asymmetrical replication groupings are also possible depending on your
> use case. For example, you may want to have three replicas per data center
> to serve real-time application requests, and then have a single replica in a
> separate data center designated to running analytics."
>
> I have 2 questions :
>
> 1. Is there any example of how to configure a topology with 3 replicas in
> one DC ( with 2 in 1 rack + 1 in another rack ) and one replica in another DC ?
> The default NetworkTopologyStrategy with RackInferringSnitch will only
> give me equal distribution ( 2+2 )
>
> 2. I am assuming the reads can go to any of the replicas. Is there a
> client which will send the query to a node ( in the cassandra ring ) which is
> closest to the client ?
>
> -Thanks,
> Prasenjit
>
> --
> Tyler Hobbs
> DataStax
