cassandra-user mailing list archives

From Віталій Тимчишин <tiv...@gmail.com>
Subject Re: Failing operations & repair
Date Sat, 09 Jun 2012 08:14:39 GMT
Thanks a lot. I was not sure whether the coordinator somehow tries to "roll back"
transactions that failed to reach their consistency level.
(Yet I could not imagine a method to do this without 2-phase commit :) )
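
To make the "no roll-back" point concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not Cassandra code: `Replica`, `coordinator_write` and `TimedOutError` are hypothetical names invented for illustration, and a "timeout" is modelled simply as a replica that never acknowledges.

```python
class TimedOutError(Exception):
    """Hypothetical stand-in for a client-visible timeout; not a driver class."""


class Replica:
    """Toy replica: stores (timestamp, value) per key if it is responsive."""

    def __init__(self, name, responsive=True):
        self.name = name
        self.responsive = responsive
        self.data = {}

    def write(self, key, value, timestamp):
        if not self.responsive:
            return False  # no acknowledgement reaches the coordinator
        self.data[key] = (timestamp, value)
        return True


def coordinator_write(replicas, key, value, timestamp, required_acks):
    """Send the write to every replica and count acknowledgements."""
    acks = sum(1 for r in replicas if r.write(key, value, timestamp))
    if acks < required_acks:
        # No roll-back happens here: replicas that accepted the write keep it.
        raise TimedOutError(f"{acks} of {required_acks} required acks")
    return acks


replicas = [Replica("a"), Replica("b", responsive=False), Replica("c", responsive=False)]
try:
    coordinator_write(replicas, "k", "v1", timestamp=1, required_acks=2)
    failed = False
except TimedOutError:
    failed = True

# The client saw a failure, yet replica "a" still holds the value.
holders = [r.name for r in replicas if "k" in r.data]
print(failed, holders)
```

In this toy run the client sees a failure while one replica keeps the write, which is the state that later repair mechanisms have to reconcile.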

2012/6/8 aaron morton <aaron@thelastpickle.com>

> I am making some cassandra presentations in Kyiv and would like to check
> that I am telling people truth :)
>
> Thanks for spreading the word :)
>
> 1) Failed (from client-side view) operation may still be applied to cluster
>
> Yes.
> If you fail with UnavailableException it's because, from the coordinator's
> view of the cluster, there are fewer than CL nodes available. So retry.
> Somewhat similar story with TimedOutException.
>
> 2) Coordinator does not try anything to "roll-back" operation that failed
> because it was processed by fewer than the consistency level number of nodes.
>
> Correct.
>
> 3) Hinted handoff works only for successful operations.
>
> HH will be stored if the coordinator proceeds with the request.
> In 1.X HH is stored on the coordinator if a replica is down when the
> request starts, and also if the node does not reply within rpc_timeout.
>
> 4) Counters are not reliable because of (1)
>
> If you get a TimedOutException when writing a counter you should not
> re-send the request.
>
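
A small Python sketch of why a retry after a timeout is risky for counters but harmless for a timestamped overwrite. Everything here is a hypothetical toy (`FlakyNode`, `with_retry`, `TimedOutError` are made-up names), assuming the request was applied but its acknowledgement was lost.

```python
class TimedOutError(Exception):
    """Hypothetical stand-in for a client-visible timeout."""


class FlakyNode:
    """Toy node that applies every request but loses the first acknowledgement."""

    def __init__(self):
        self.counter = 0
        self.cells = {}
        self._drop_next_ack = True

    def _ack(self):
        if self._drop_next_ack:
            self._drop_next_ack = False
            raise TimedOutError("acknowledgement lost")

    def increment(self, delta):
        self.counter += delta  # applied before the ack is lost
        self._ack()

    def put(self, key, value, timestamp):
        current = self.cells.get(key)
        if current is None or timestamp >= current[0]:
            self.cells[key] = (timestamp, value)  # same timestamp: same result
        self._ack()


def with_retry(operation, attempts=2):
    """Re-run the operation on timeout, re-raising after the last attempt."""
    for attempt in range(attempts):
        try:
            return operation()
        except TimedOutError:
            if attempt == attempts - 1:
                raise


counter_node = FlakyNode()
with_retry(lambda: counter_node.increment(1))
counter_after_retry = counter_node.counter  # the increment was applied twice

cell_node = FlakyNode()
with_retry(lambda: cell_node.put("k", "v", timestamp=10))
cells_after_retry = cell_node.cells  # the overwrite is unchanged by the retry

print(counter_after_retry, cells_after_retry)
```

The retried increment over-counts, while the retried overwrite with the same timestamp leaves the same cell behind.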
> 5) Read-repair may help to propagate an operation that failed its
> consistency level, but was persisted to some nodes.
>
> Yes. It works in the background, and by default is only enabled on 10% of
> requests.
> Note that RR is not the same as the Consistency Level for read. If you work
> at a CL > ONE the results from CL nodes are always compared and differences
> resolved. RR is concerned with the replicas not involved in the CL read.
>
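
The "compared and differences resolved" step can be sketched as a simplified last-write-wins reconciliation in Python. This is an illustration only: `resolve` and the response layout are hypothetical, and the sketch ignores timestamp ties, tombstones and digest reads.

```python
def resolve(responses):
    """Pick the newest cell among replica responses and list stale replicas.

    responses maps a replica name to a (timestamp, value) tuple,
    or to None when that replica has no data for the key.
    """
    present = [cell for cell in responses.values() if cell is not None]
    if not present:
        return None, []
    winner = max(present, key=lambda cell: cell[0])  # highest timestamp wins
    # Replicas whose answer differs from the winner would be sent the newer cell.
    stale = sorted(name for name, cell in responses.items() if cell != winner)
    return winner, stale


responses = {"a": (2, "new"), "b": (1, "old"), "c": None}
winner, stale = resolve(responses)
print(winner, stale)
```

Here replica "a" holds the newest cell, so "b" and "c" are the ones that would receive it.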
> 6) Manual repair is still needed because of (2) and (3)
>
> Manual repair is *the* way to achieve consistency of data on disk. HH and
> RR are optimisations designed to reduce the chance of a Digest Mismatch
> during a read with CL > ONE.
> It is also essential for distributing Tombstones before they are purged by
> compaction.
>
> P.S. If some points apply only to some cassandra versions, I will be happy
> to know this too.
>
> Assume everything is for version 1.X
>
> Thanks
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 8/06/2012, at 1:20 AM, Віталій Тимчишин wrote:
>
> Hello.
>
> I am making some cassandra presentations in Kyiv and would like to check
> that I am telling people truth :)
> Could community tell me if next points are true:
> 1) Failed (from client-side view) operation may still be applied to cluster
> 2) Coordinator does not try anything to "roll-back" operation that failed
> because it was processed by fewer than the consistency level number of nodes.
> 3) Hinted handoff works only for successful operations.
> 4) Counters are not reliable because of (1)
> 5) Read-repair may help to propagate an operation that failed its
> consistency level, but was persisted to some nodes.
> 6) Manual repair is still needed because of (2) and (3)
>
> P.S. If some points apply only to some cassandra versions, I will be happy
> to know this too.
> --
> Best regards,
>  Vitalii Tymchyshyn
>
>
>


-- 
Best regards,
 Vitalii Tymchyshyn
