cassandra-user mailing list archives

From Jonathan Colby <jonathan.co...@gmail.com>
Subject Re: Location-aware replication based on objects' access pattern
Date Wed, 06 Apr 2011 07:26:42 GMT
Good to see a discussion on this.

This also has a practical use for business continuity: you could ensure that clients
in a given data center write replicas to their own data center first, then to the other data
center for backup.  If I understand correctly, a write takes the token into account first,
then the replication strategy decides where the replicas go.  I would like to see the
first writes be based on "location" instead of token, whether that is accomplished by
manipulating the key or by some other mechanism.
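
As a rough sketch of what "location first" placement could look like (this is not
Cassandra's API; the ring layout, node names, and function names are all made up for
illustration), the token still picks the starting point on the ring, but the local
replicas are taken from the writer's data center before any remote ones:

```python
from bisect import bisect_left
from hashlib import md5

# Hypothetical ring: (token, node name, data center), sorted by token.
RING = [
    (0, "a1", "DC1"),
    (25, "b1", "DC2"),
    (50, "a2", "DC1"),
    (75, "b2", "DC2"),
]


def token_for(key, ring_size=100):
    """Hash the key onto the toy ring so load stays evenly spread."""
    return int(md5(key.encode()).hexdigest(), 16) % ring_size


def replicas_local_first(token, preferred_dc, local_count, remote_count):
    """Walk the ring clockwise from the token and pick replicas.

    Nodes in preferred_dc are chosen first (up to local_count), then
    nodes in any other data center (up to remote_count).
    """
    tokens = [t for t, _, _ in RING]
    start = bisect_left(tokens, token) % len(RING)
    walk = [RING[(start + i) % len(RING)] for i in range(len(RING))]
    local = [name for _, name, dc in walk if dc == preferred_dc][:local_count]
    remote = [name for _, name, dc in walk if dc != preferred_dc][:remote_count]
    return local + remote


# A client in DC2 writing a key: two local replicas, one remote backup.
print(replicas_local_first(token_for("some-row-key"), "DC2", 2, 1))
```

The order of the returned list carries no special meaning here; it is only a
convenient way to show which nodes were chosen for which reason.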

That way, if you do suffer the loss of a data center, the clients are guaranteed to meet
quorum on the nodes in their own data center (given a mirrored architecture across 2 data
centers).

We have 2 data centers.  If one goes down we have the problem that quorum cannot be satisfied
for half of the reads.
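
To make the arithmetic concrete, here is a small sketch. It assumes a replication
factor of 3 split 2+1 across the two data centers (an assumption for illustration,
not a description of our exact setup), and the function names are hypothetical.
Quorum is floor(RF / 2) + 1, so a key whose two replicas sit in the lost data center
has only one replica left and cannot reach quorum:

```python
def quorum(replication_factor):
    """Number of replicas that must respond for a quorum."""
    return replication_factor // 2 + 1


def quorum_survives(replicas_per_dc, lost_dc):
    """True if a quorum is still reachable after lost_dc goes down.

    replicas_per_dc maps a data center name to the number of replicas
    of one key that live there.
    """
    rf = sum(replicas_per_dc.values())
    alive = rf - replicas_per_dc.get(lost_dc, 0)
    return alive >= quorum(rf)


# Keys with 2 replicas in DC1 and 1 in DC2:
print(quorum_survives({"DC1": 2, "DC2": 1}, lost_dc="DC1"))  # majority lost
print(quorum_survives({"DC1": 2, "DC2": 1}, lost_dc="DC2"))  # majority kept
```

If roughly half of the keys have their majority in each data center, then roughly
half of the quorum reads fail when either one goes down.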


On Apr 6, 2011, at 6:00 AM, Jonathan Ellis wrote:

> On Tue, Apr 5, 2011 at 10:45 PM, Yudong Gao <stgyd@umich.edu> wrote:
>>> A better solution would be to just push the DecoratedKey into the
>>> ReplicationStrategy so it can make its decision before information is
>>> thrown away.
>> 
>> I agree. So in this case, I guess the hash-based token ring is still
>> preserved to avoid hot spots, but we further use the DecoratedKey to
>> guide the replication strategy. For example, replica 2 is placed on
>> the first node along the ring that belongs to the desired data center
>> (based on the location hint embedded in the DecoratedKey). But we may not be
>> able to control the primary replica. Do you think this would be a
>> reasonable design?
> 
> calculateNaturalEndpoints has complete freedom to generate all
> replicas any way it likes.  Thinking of an endpoint as "primary"
> because it was generated first by one algorithm is dangerous.
> 
> As one of the docstrings explains, replica destinations ("endpoints")
> should be considered a Set even though we use a List for efficiency.
> None of them are special at the ReplicationStrategy level.
> 
>> Just curious, are they happy with the current
>> solution with keyspace, and are there any requests for per-row
>> placement control?
> 
> Enough people want to try it that we have the ticket open. :)
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com

