incubator-cassandra-user mailing list archives

From Mikael Wikblom <mikael.wikb...@sitevision.se>
Subject Re: CL - locally consistent ONE
Date Fri, 28 Oct 2011 12:32:33 GMT
Thank you for the reply.

On 10/28/2011 11:45 AM, Peter Schuller wrote:
>> I've patched the classes WriteResponseHandler and ReadCallback to make sure
>> that the local node has returned before sending the condition signal. Can
>> anyone see any drawbacks with this approach? I realize this will only work
>> as long as the replication factor is the same as the number of nodes, but
>> that is ok for our scenario.
> So the "local" node is the co-ordinator. Is it the case that each CMS
> instance (with embedded cassandra) is always using "itself" as the
> co-ordinator, and that the requirement you have is that *that
> particular CMS instance* must see its *own* writes? And the reason
> you are using RF=number of nodes is that you're wanting to make sure
> data is always on the local node?
Yes, this is the case (you explain it much better than I did). We have an 
issue with invalidation of instance caches that also requires that 
RF = number of nodes (it uses a row cache). Making sure all nodes contain 
the data also makes reads and writes blazing fast when using CL.ONE 
(the client does not have to wait for any remote calls to complete).
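For reference, the patched-handler idea can be sketched roughly as below. This is a minimal, hypothetical sketch (the class and method names are illustrative, not Cassandra's actual WriteResponseHandler / ReadCallback internals): the waiting condition is only signalled once the local replica has acknowledged, in addition to the normal blockFor count.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a "locally consistent ONE" response handler.
// Completion is signalled only when BOTH conditions hold:
//   1. at least blockFor replicas have responded (the normal CL check), and
//   2. the coordinator's own (local) replica has responded.
public class LocallyConsistentOneHandler {
    private final CountDownLatch condition = new CountDownLatch(1);
    private final AtomicInteger responses = new AtomicInteger(0);
    private volatile boolean localResponded = false;
    private final int blockFor; // 1 for CL.ONE

    public LocallyConsistentOneHandler(int blockFor) {
        this.blockFor = blockFor;
    }

    // Called once per replica response; isLocal marks the coordinator's own replica.
    public void response(boolean isLocal) {
        if (isLocal)
            localResponded = true;
        int n = responses.incrementAndGet();
        if (n >= blockFor && localResponded)
            condition.countDown();
    }

    // Returns true if the combined condition was met within the timeout.
    public boolean await(long timeoutMillis) {
        try {
            return condition.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Note that with this scheme a remote ack alone never unblocks the caller, which is exactly why the operation can become slower than plain CL.ONE when the local node is the slow one.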
> If that is true, it *seems* to me it should work *kind of* as long as
> the CMS instances never ever use another Cassandra node *and* as long
> as you accept that a write may disappear in case of a sudden node
> failure (as usual with CL.ONE). I do think it feels like a fragile
> approach though, that would be nice to avoid if possible/realistic.
>
> I am curious as to performance though. It seems a lot safer to just
> use QUORUM *at least* for writes; keep in mind that regardless of
> CL.ONE your writes still go to all the other replicas (in this case all
> nodes since you say you have RF = cluster size) so in terms of
> throughput using CL.ONE should not be faster. It should be a bit
> better for latency in the common case though (which might translate to
> throughput for a sequential user). If you can do writes on QUORUM at
> least, even if not reads, you also avoid the problem of an
> acknowledged write disappearing in case of node failures.
During tests I've done mass mutations using an import of data. Using 
CL.QUORUM, the import takes around 3 times longer than using CL.ONE on a 
cluster with 3 nodes. Due to our framework and the application logic 
involved in imports, the import includes many reads along with the writes.

I think this is expected behavior though: in the CL.ONE case more or 
less all reads / writes are done locally, whereas with QUORUM all 
operations are done both locally and remotely.

With the locally consistent ONE approach we avoid consistency problems 
when the local node is slow, and the import time is more or less the 
same as with CL.ONE. We do, however, introduce potentially slower reads / 
writes when the response time of the local node is slower than that of 
any of the remote nodes, but because cassandra is embedded in SiteVision 
the entire JVM will probably react slowly at that stage - a fast read 
from a remote node will not help much.

Another problem with using QUORUM is that it does not scale well in 
big production environments. There are cases where a customer 
temporarily uses up to 10 nodes, where QUORUM would mean reading / 
writing to 6 nodes.
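For concreteness, the quorum size is floor(RF/2) + 1, which is where the 6-out-of-10 figure comes from. A one-line sketch (class name is illustrative):

```java
// Quorum size for a given replication factor: floor(RF/2) + 1.
public class QuorumMath {
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }
}
```

So quorum(3) is 2, while quorum(10) is 6: with RF = 10, every quorum read or write must wait on six nodes.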

> Are you in a position where the nodes in the cluster are wide apart
> (e.g. different DC:s), for the writes to be a significant problem for
> latency?
No; gigabit network between the nodes, same network, etc.

Kind regards

-- 
Mikael Wikblom
Software Architect
SiteVision AB
019-217058
mikael.wikblom@sitevision.se
http://www.sitevision.se

