From: Vitalii Tymchyshyn
Date: Thu, 14 Jun 2012 19:00:19 +0300
To: crypto five
CC: user@cassandra.apache.org
Subject: Re: Failing operations & repair

Hello.

For sure. Here they are: http://www.slideshare.net/vittim1/practical-cassandra
The slides are in English.
I gave this presentation some time ago at JEEConf and again yesterday at a local developers' club.
A video recording (in Russian) should become available at some point, but it is not up yet.

Best regards, Vitalii Tymchyshyn

13.06.12 02:27, crypto five wrote:
It would be really great to look at your slides. Do you have any plans to share your presentation?

On Sat, Jun 9, 2012 at 1:14 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
Thanks a lot. I was not sure whether the coordinator somehow tries to "roll back" operations that failed to reach their consistency level.
(Though I could not imagine a way to do this without two-phase commit :) )


2012/6/8 aaron morton <aaron@thelastpickle.com>
> I am making some Cassandra presentations in Kyiv and would like to check that I am telling people the truth :)
Thanks for spreading the word :)

> 1) A failed (from the client-side view) operation may still be applied to the cluster
Yes.
If you fail with UnavailableException, it is because, from the coordinator's view of the cluster, fewer than CL nodes are available. So retry. It is a somewhat similar story with TimedOutException.
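
A minimal sketch of the retry advice above, not a real client API. The two exception classes are local stand-ins named after the Thrift exceptions mentioned in the answer, and `write_with_retry`, `do_write`, and `idempotent` are hypothetical names introduced for illustration. It assumes that a timed-out write may already have reached some replicas, so it is only re-sent when applying it twice is harmless.

```python
class UnavailableException(Exception):
    """Stand-in: the coordinator saw fewer live replicas than the CL requires."""


class TimedOutException(Exception):
    """Stand-in: too few replicas acknowledged the request in time."""


def write_with_retry(do_write, idempotent, max_attempts=3):
    """Call do_write() until it succeeds or the attempts run out.

    do_write   -- hypothetical zero-argument callable performing one write
    idempotent -- True if applying the same write twice is harmless
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return do_write()
        except UnavailableException as exc:
            # Too few replicas looked alive to the coordinator: try again.
            last_error = exc
        except TimedOutException as exc:
            # The write may already be on some replicas, so only
            # re-send it when a duplicate cannot change the result.
            if not idempotent:
                raise
            last_error = exc
    raise last_error
```

An ordinary column write with a fixed timestamp would be passed with `idempotent=True`; a counter increment would not.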

> 2) The coordinator does not try to "roll back" an operation that failed because it was processed by fewer nodes than the consistency level requires.
Correct.
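
To make points 1 and 2 concrete, here is a toy model, not Cassandra code. `Replica`, `coordinator_write`, and the `drops_messages` flag are hypothetical names; the flag simply simulates a replica that never receives the write. The coordinator counts acknowledgements and reports failure when there are too few, but nothing undoes the write on the replica that did apply it.

```python
class TimedOutException(Exception):
    """Stand-in for the client-visible timeout error."""


class Replica:
    def __init__(self, drops_messages=False):
        self.drops_messages = drops_messages
        self.data = {}

    def apply(self, key, value):
        """Store the value and return True, or return False if the message is lost."""
        if self.drops_messages:
            return False
        self.data[key] = value
        return True


def coordinator_write(replicas, key, value, required_acks):
    """Send the write to every replica and count acknowledgements."""
    acks = sum(1 for replica in replicas if replica.apply(key, value))
    if acks < required_acks:
        # No undo step here: replicas that applied the write keep it.
        raise TimedOutException(f"got {acks} acks, needed {required_acks}")
    return acks


replicas = [Replica(), Replica(drops_messages=True), Replica(drops_messages=True)]
try:
    coordinator_write(replicas, "k", "v", required_acks=2)  # quorum of 3
    client_saw_failure = False
except TimedOutException:
    client_saw_failure = True
```

After this runs, the client has seen a failure, yet the first replica still holds the value and can serve it to a later read.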

> 3) Hinted handoff works only for successful operations.
HH will be stored if the coordinator proceeds with the request.
In 1.X, a hint is stored on the coordinator if a replica is down when the request starts, and also if the node does not reply within rpc_timeout.
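
The 1.X rule just described can be written as a small predicate. This is only an illustration under the assumption that the coordinator has already decided to proceed with the request; `hints_to_store` and its parameters are hypothetical names, not Cassandra internals.

```python
def hints_to_store(replicas, down_at_start, replied_in_time):
    """Return the replicas for which the coordinator would keep a hint.

    replicas        -- all replicas responsible for the row
    down_at_start   -- replicas known to be down when the request started
    replied_in_time -- replicas that acknowledged within rpc_timeout
    """
    hints = []
    for replica in replicas:
        if replica in down_at_start or replica not in replied_in_time:
            hints.append(replica)
    return hints
```

For example, with replicas `a`, `b`, `c`, where `c` was down at the start and only `a` replied in time, hints would be kept for `b` and `c`.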

> 4) Counters are not reliable because of (1)
If you get a TimedOutException when writing a counter, you should not re-send the request.
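
A toy illustration of why re-sending is risky, assuming the timed-out increment was in fact applied and only the acknowledgement was lost. `ToyCounter` and `add_with_blind_retry` are hypothetical names; this is not how Cassandra stores counters.

```python
class TimedOutException(Exception):
    """Stand-in for the client-visible timeout error."""


class ToyCounter:
    def __init__(self, lose_first_ack=False):
        self.value = 0
        self._lose_ack = lose_first_ack

    def add(self, delta):
        self.value += delta  # the increment is applied first...
        if self._lose_ack:
            # ...and then the acknowledgement goes missing.
            self._lose_ack = False
            raise TimedOutException("applied, but the ack was lost")


def add_with_blind_retry(counter, delta):
    """Anti-pattern: re-send the increment after a timeout."""
    try:
        counter.add(delta)
    except TimedOutException:
        counter.add(delta)


counter = ToyCounter(lose_first_ack=True)
add_with_blind_retry(counter, 1)
```

The client asked for a single increment of 1, but `counter.value` ends up at 2, because an increment is not idempotent.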

> 5) Read repair may help to propagate an operation that failed its consistency level but was persisted to some nodes.
Yes. It works in the background and by default is only enabled on 10% of requests.
Note that RR is not the same as the Consistency Level for a read. If you read at a CL > ONE, the results from the CL nodes are always compared and differences resolved. RR is concerned with the replicas not involved in the CL read.
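
The distinction between the CL read and read repair can be sketched as follows. This is a simplified model, not the actual read path: `replicas_to_compare` is a hypothetical function, and the 0.1 default mirrors the 10% figure quoted above. The replicas needed for the CL are always compared; the remaining replicas only join in with the configured probability.

```python
import random


def replicas_to_compare(replicas, cl_count, read_repair_chance=0.1, rng=random):
    """Return the replicas whose answers are compared for one read.

    The first cl_count replicas always take part (the CL read).
    With probability read_repair_chance the rest are included as well.
    """
    if rng.random() < read_repair_chance:
        return list(replicas)
    return list(replicas[:cl_count])
```

With three replicas and a CL of two, most reads compare two replicas, and roughly one in ten also checks the third.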

> 6) Manual repair is still needed because of (2) and (3)
Manual repair is *the* way to achieve consistency of data on disk. HH and RR are optimisations designed to reduce the chance of a Digest Mismatch during a read with CL > ONE.
It is also essential for distributing Tombstones before they are purged by compaction.
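
The tombstone point implies a scheduling constraint, sketched below. The sketch assumes the purge window is the column family's gc_grace_seconds setting, which is not named in this thread, and uses its usual default of 864000 seconds (10 days). `repair_interval_is_safe` is a hypothetical helper, not a Cassandra tool.

```python
from datetime import timedelta

# Assumed default for gc_grace_seconds: 864000 seconds, i.e. 10 days.
DEFAULT_GC_GRACE = timedelta(seconds=864000)


def repair_interval_is_safe(repair_interval, gc_grace=DEFAULT_GC_GRACE):
    """True if each node is repaired at least once before tombstones can be purged."""
    return repair_interval < gc_grace
```

Under these assumptions a weekly repair fits inside the window, while a fortnightly one does not.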
> P.S. If some points apply only to some Cassandra versions, I will be happy to know this too.
Assume everything is for version 1.X

Thanks

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 8/06/2012, at 1:20 AM, Віталій Тимчишин wrote:

Hello.

I am making some Cassandra presentations in Kyiv and would like to check that I am telling people the truth :)
Could the community tell me whether the following points are true:
1) A failed (from the client-side view) operation may still be applied to the cluster
2) The coordinator does not try to "roll back" an operation that failed because it was processed by fewer nodes than the consistency level requires.
3) Hinted handoff works only for successful operations.
4) Counters are not reliable because of (1)
5) Read repair may help to propagate an operation that failed its consistency level but was persisted to some nodes.
6) Manual repair is still needed because of (2) and (3)

P.S. If some points apply only to some Cassandra versions, I will be happy to know this too.
--
Best regards,
 Vitalii Tymchyshyn




--
Best regards,
 Vitalii Tymchyshyn

