incubator-cassandra-user mailing list archives

From "Pierre Chalamet" <pie...@chalamet.net>
Subject RE: Get CL ONE / NTS
Date Wed, 14 Sep 2011 21:57:04 GMT
After reading the Cassandra source code, I will try to answer my own
questions. It's kind of a good exercise :)

>1/ Will I get an error because DC2 does not have any copy of the data?
I have not been able to find how endpoints are determined for a read
request, but I guess the endpoints just come from the local datacenter.

>2/ Will Cassandra try to get the data from DC1 if nothing is found in DC2?
Probably not, given 1/.

>3/ In case of partial replication to DC2, will I sometimes see errors about
>servers in DC2 not holding the data?
It seems to depend on read repair (RR). If read_repair_chance is set to 1
(the default value), RR happens on every read, so the answer is no.
If read_repair_chance is below 1, it seems a CL.ONE read will fail if the
single read request fails.
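
As a rough illustration of the reading above, here is a toy model of how
read_repair_chance could decide whether a CL.ONE read contacts every replica
or only one. This is a sketch of my understanding, not Cassandra's actual
code: the function name replicas_to_query and the "first replica is the
closest" ordering are assumptions made for the example.

```python
import random


def replicas_to_query(replicas, read_repair_chance, rng=random):
    """Toy model: choose which replicas a CL.ONE read contacts.

    Assumption: `replicas` is already sorted by proximity, closest first.
    """
    if not replicas:
        raise ValueError("no replicas available for this key")
    # With probability read_repair_chance, contact every replica so their
    # answers can be compared (and repaired) against each other.
    if rng.random() < read_repair_chance:
        return list(replicas)
    # Otherwise only the closest replica is asked.
    return [replicas[0]]
```

With read_repair_chance = 1 every replica is contacted on each read; with a
value below 1 some reads depend on a single replica, which is the case the
answer above worries about.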

>4/ Does a Get at CL ONE fail as soon as the fastest server to answer says it
>does not have the data, or does it wait until all servers say they do not
>have the data?
It seems to depend on RR, as in 3/.

Are these answers right?

- Pierre


-----Original Message-----
From: Pierre Chalamet [mailto:pierre@chalamet.net] 
Sent: Wednesday, September 14, 2011 3:33 PM
To: user@cassandra.apache.org
Subject: Get CL ONE / NTS

Hello,

I have 2 datacenters. Cassandra is configured as follows:
- RackInferringSnitch
- NetworkTopologyStrategy for CF
- strategy_options: DC1:3 DC2:3

Data is written using CL LOCAL_QUORUM, so data written in one datacenter
will eventually be replicated to the other datacenter. Data is always
written exactly once.
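
For reference, a minimal sketch of the acknowledgement arithmetic this
configuration implies, assuming the usual quorum formula floor(RF / 2) + 1.
The helper name quorum is only illustrative; it is not a Cassandra API.

```python
def quorum(replication_factor):
    """Number of replicas that must acknowledge for a quorum."""
    return replication_factor // 2 + 1


# Replication factors from strategy_options above.
strategy_options = {"DC1": 3, "DC2": 3}

# LOCAL_QUORUM only counts replicas in the coordinator's own datacenter.
local_quorum = {dc: quorum(rf) for dc, rf in strategy_options.items()}

# A plain QUORUM would count replicas across both datacenters.
global_quorum = quorum(sum(strategy_options.values()))

print(local_quorum)   # {'DC1': 2, 'DC2': 2}
print(global_quorum)  # 4
```

So a LOCAL_QUORUM write issued in DC1 returns after 2 acknowledgements from
DC1 and does not wait for any replica in DC2.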

On the other hand, I'd like to improve the read path. I'm currently using
CL ONE since data is only written once (i.e. the timestamp is more or less
meaningless in my case).

This is where I have some doubts: if data is written to DC1 and a read is
attempted from DC2 while the data is not yet replicated, or only partially
replicated (for whatever good reason, since replication is async), what is
the behavior of Get with CL ONE / NTS?
1/ Will I get an error because DC2 does not have any copy of the data?
2/ Will Cassandra try to get the data from DC1 if nothing is found in DC2?
3/ In case of partial replication to DC2, will I sometimes see errors about
servers in DC2 not holding the data?
4/ Does a Get at CL ONE fail as soon as the fastest server to answer says it
does not have the data, or does it wait until all servers say they do not
have the data?

Thanks a lot,
- Pierre

