incubator-cassandra-user mailing list archives

From Wouter de Bie <wouter.de....@deltaprojects.se>
Subject Re: Consistency level ONE
Date Tue, 29 Jun 2010 12:33:42 GMT

On Jun 29, 2010, at 1:03 PM, Sylvain Lebresne wrote:

>> Hi all,
>> 
>> I'm having some issues with read consistency level ONE. The Wiki (and
>> other sources) say the following:
>> 
>> Will return the record returned by the first node to respond. A
>> consistency check is always done in a background thread to fix any
>> consistency issues when ConsistencyLevel.ONE is used. This means
>> subsequent calls will have correct data even if the initial read gets
>> an older value. (This is called read repair.)
>> 
>> However, when looking at the code, it seems that the read is only
>> directed towards the first node that is suitable (and alive). This means
>> that a slow node will cause slow responses even though my replication
>> factor is > 1. I would expect the read to go to all the suitable nodes
>> and as soon as one of those nodes responds, the reply is used (just as
>> the documentation says).
>> 
>> Moving to Quorum reads would solve part of this problem, but with one
>> server down and 1 slow one, I'm back to square one.
> 
> This would not solve part of the problem.
> 
> When you do a QUORUM read, the values of the requested columns are not
> fetched from every replica. Instead, the value is requested from one node
> and only a digest of the value is requested from the other nodes. This
> avoids excessive intra-cluster transfer (saving bandwidth and making reads
> more efficient), since under normal conditions you expect all the values to
> be identical, so transferring all that data would be wasteful. Only if the
> value and a digest do not match are the actual values requested.
> 
> The same goes for CL.ONE. The background consistency check only asks for
> digests, which saves a lot of internal bandwidth.
> 
> Now, back to the slow node problem. The code already does its best to ask
> the best-suited node: first by retrieving the data locally if possible, then
> by using the EndpointSnitch, which you can configure to tell Cassandra which
> node is best suited.
> There remains the problem of a node that is slow for a temporary reason,
> either a network problem or because the node is overloaded and cannot keep
> up. But Cassandra chooses to optimize for the normal case rather than the
> error case, which I believe is the right choice.
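
The digest read described above can be illustrated with a small Python
sketch. This is not Cassandra's code: the Replica class, the quorum_read
function, the use of MD5 and the timestamp-based resolution on mismatch are
all simplifying assumptions made for illustration. It only shows the idea
that one replica returns the full value while the others return a digest,
and that full values are fetched from the others only when a digest differs.

```python
import hashlib


def digest(timestamp: int, value: bytes) -> str:
    # Hash choice (MD5) is an assumption for this sketch.
    return hashlib.md5(str(timestamp).encode() + b":" + value).hexdigest()


class Replica:
    """Hypothetical in-memory replica holding one (timestamp, value) pair."""

    def __init__(self, name: str, timestamp: int, value: bytes):
        self.name = name
        self.timestamp = timestamp
        self.value = value
        self.full_reads = 0  # counts how often the full value was shipped

    def read_data(self):
        self.full_reads += 1
        return self.timestamp, self.value

    def read_digest(self) -> str:
        return digest(self.timestamp, self.value)


def quorum_read(replicas, quorum: int) -> bytes:
    """Ask one replica for data and the rest of the quorum for digests."""
    contacted = replicas[:quorum]
    data_node, digest_nodes = contacted[0], contacted[1:]

    ts, value = data_node.read_data()
    expected = digest(ts, value)

    mismatched = [r for r in digest_nodes if r.read_digest() != expected]
    if not mismatched:
        return value  # normal case: only one full value crossed the wire

    # Mismatch: only now fetch full values, and keep the newest one
    # (assumed resolution rule: highest timestamp wins).
    versions = [(ts, value)] + [r.read_data() for r in mismatched]
    return max(versions, key=lambda v: v[0])[1]
```

With replicas a (stale), b and c (current), a quorum read over [b, c]
transfers one full value, while a quorum read over [a, b] detects the digest
mismatch and fetches the full value from b as well.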

Thanks for the explanation. It looks like the dynamic endpoint snitch in the
0.7 release would help me here.
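
To make the idea concrete, here is a rough Python sketch of latency-based
replica ranking, which is how I understand a dynamic snitch to help with a
temporarily slow node. It is only an illustration under my own assumptions:
the LatencyRanker class, its method names, the sliding window and the
plain-average score are hypothetical and are not Cassandra's API or its
actual scoring.

```python
from collections import defaultdict, deque


class LatencyRanker:
    """Hypothetical helper that orders replicas by recent read latency."""

    def __init__(self, window: int = 100):
        # Keep only the most recent `window` samples per node, so a node
        # that recovers is not penalized forever.
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, node: str, latency_ms: float) -> None:
        self.samples[node].append(latency_ms)

    def score(self, node: str) -> float:
        s = self.samples[node]
        # Nodes without samples score 0.0 and are therefore tried first.
        return sum(s) / len(s) if s else 0.0

    def rank(self, replicas):
        # Lowest average latency first; sorted() is stable, so ties keep
        # the order the caller passed in.
        return sorted(replicas, key=self.score)
```

A CL.ONE read would then go to the first node in rank(replicas) instead of
a statically chosen one, so a node that has been answering slowly drifts to
the back of the list.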

Greetings,

Wouter