From: Alain RODRIGUEZ <arodrime@gmail.com>
Date: Thu, 16 May 2013 13:49:08 +0200
Subject: Re: (unofficial) Community Poll for Production Operators : Repair
To: user@cassandra.apache.org

@Rob: Thanks for the feedback.

Yet I still see behavior around repair that I cannot explain. Are counters
supposed to be "repaired" too? While reading at CL.ONE I can get different
values depending on which node answers, even after a read repair or a full
repair. Shouldn't a repair fix these discrepancies?

The only way I have found to always get the same count is to read at
CL.QUORUM, but that is only a workaround, since the data itself remains
wrong on some nodes.
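
For what it's worth, here is roughly how I compare the two reads from
cqlsh (a minimal sketch; the keyspace, table and column names are invented
for the example):

    cqlsh> CONSISTENCY ONE;
    cqlsh> SELECT hits FROM stats.page_counters WHERE page_id = 'home';
    cqlsh> CONSISTENCY QUORUM;
    cqlsh> SELECT hits FROM stats.page_counters WHERE page_id = 'home';

Assuming RF=3 for illustration, the QUORUM read has to hear from two
replicas, so a single divergent replica can no longer win the read on its
own, which matches the count only looking stable at QUORUM.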

Any clue on this?

Alain

2013/5/15 Edward Capriolo <edlinuxguru@gmail.com>
> http://basho.com/introducing-riak-1-3/
>
> Introduced Active Anti-Entropy. Riak now has active anti-entropy. In
> distributed systems, inconsistencies can arise between replicas due to
> failure modes, concurrent updates, and physical data loss or corruption.
> Pre-1.3 Riak already had several features for repairing this "entropy",
> but they all required some form of user intervention. Riak 1.3 introduces
> automatic, self-healing properties that repair entropy on an ongoing basis.


> On Wed, May 15, 2013 at 5:32 PM, Robert Coli <rcoli@eventbrite.com> wrote:
>> On Wed, May 15, 2013 at 1:27 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>> > Rob, I was wondering something. Are you a committer working on
>> > improving the repair or something similar?
>>
>> I am not a committer [1], but I have an active interest in potential
>> improvements to the best practices for repair. The specific change
>> that I am considering is a modification to the default
>> gc_grace_seconds value, which seems picked out of a hat at 10 days. My
>> view is that the current implementation of repair has such negative
>> performance consequences that I do not believe that holding onto
>> tombstones for longer than 10 days could possibly be as bad as the
>> fixed cost of running repair once every 10 days. I believe that this
>> value is too low for a default (it also does not map cleanly to the
>> work week!) and likely should be increased to 14, 21 or 28 days.
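>>
>> Overriding the default on an existing table is a one-line ALTER (a
>> sketch only; the keyspace and table names are invented, and 2419200
>> seconds is 28 * 86400):
>>
>>     cqlsh> ALTER TABLE my_keyspace.my_table
>>        ... WITH gc_grace_seconds = 2419200;
>>
>> The usual operational constraint still applies: every node has to
>> complete nodetool repair -pr at least once within whatever window you
>> pick, or tombstones can be collected before they propagate and deleted
>> data can come back.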

>> > Anyway, if a committer (or any other expert) could give us some
>> > feedback on our comments (Are we doing well or not, whether things we
>> > observe are normal or unexplained, what is going to be improved in
>> > the future about repair...)

>> 1) you are doing things according to best practice
>> 2) unfortunately your experience with significantly degraded
>> performance, including a blocked go-live due to repair bloat, is
>> pretty typical
>> 3) the things you are experiencing are part of the current
>> implementation of repair and are also typical, however I do not
>> believe they are fully "explained" [2]
>> 4) as has been mentioned further down thread, there are discussions
>> regarding (and some already committed) improvements to both the
>> current repair paradigm and an evolution to a new paradigm

>> Thanks to all for the responses so far, please keep them coming! :D
>>
>> =Rob
>> [1] hence the (unofficial) tag for this thread. I do have minor
>> patches accepted to the codebase, but always merged by an actual
>> committer. :)
>> [2] driftx@#cassandra feels that these things are explained/understood
>> by core team, and points to
>> https://issues.apache.org/jira/browse/CASSANDRA-5280 as a useful
>> approach to minimize same.

