From: "Vegard Berget" <post@fantasista.no>
To: user@cassandra.apache.org
Subject: Re: Changing replication factor
Date: Mon, 17 Jun 2013 14:33:16 +0200

Hi,

Thank you for the information.

I have increased the RF, and I think the increase we have seen in CPU load etc. is due to the counter CFs, which are almost write-only (they are read a few times a day). The load increase is noticeable, but no problem.

Repair went fine. But I noticed that when I increased the RF for a counter column family and (for some completely different reasons) took one node down, and after that ran repair, I would get multiple lines like this in system.log:

"invalid counter shard detected; (X, Y, Z) and (X, Y, Z2) differ only in count; will pick highest to self-heal; this indicates a bug or corruption generated a bad counter shard"

I guess this is because the counters got out of sync while the node was down, and repair just needs to pick the highest? In my case this will be (more or less) correct, since the sync problem happened because of a downed node, which means _all_ increments happened on the other node, and that node will have the correct number? I am just curious, as some minor errors in the counters would be no problem for us.
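To make the "pick highest" behaviour in that log line concrete, here is a minimal sketch of the reconciliation rule it describes. This is a simplified model and not Cassandra's actual code: the `Shard` type and the functions `reconcile` and `counter_value` are hypothetical names, and (X, Y, Z) is assumed to mean (node id, logical clock, count).

```python
from typing import Dict, List, NamedTuple


class Shard(NamedTuple):
    """Simplified counter shard: (node id, logical clock, count)."""
    node_id: str
    clock: int
    count: int


def reconcile(a: Shard, b: Shard) -> Shard:
    """Pick the surviving shard when two copies of one node's shard disagree."""
    if a.node_id != b.node_id:
        raise ValueError("shards from different nodes are summed, not reconciled")
    if a.clock != b.clock:
        # Normal case: the copy with the newer clock wins.
        return a if a.clock > b.clock else b
    # Same node id and same clock but a different count: this is the
    # "differ only in count" case from the log line. Keep the highest count.
    return a if a.count >= b.count else b


def counter_value(shards: List[Shard]) -> int:
    """Total counter value: one reconciled shard per node id, then summed."""
    best: Dict[str, Shard] = {}
    for s in shards:
        best[s.node_id] = reconcile(best[s.node_id], s) if s.node_id in best else s
    return sum(s.count for s in best.values())
```

In this model, reconciling (X, 5, 10) with (X, 5, 12) keeps (X, 5, 12), so the result is only as accurate as the higher of the two counts.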

.vegard,

----- Original Message -----
From: user@cassandra.apache.org
To: <user@cassandra.apache.org>, "Vegard Berget" <post@fantasista.no>
Sent: Fri, 14 Jun 2013 17:20:26 -0700
Subject: Re: Changing replication factor

On Mon, Jun 10, 2013 at 6:04 AM, Vegard Berget <post@fantasista.no> wrote:
> If one increases the replication factor of a keyspace and then does a repair,
> how will this affect the performance of the affected nodes? Could we risk
> the nodes being (more or less) unresponsive while repair is going on?

Repair is a relatively heavyweight activity (the heaviest a Cassandra
node can do!) which requires significant headroom in terms of CPU,
heap memory and disk space. It is possible that nodes could become
unavailable transiently during the repair, but unless they are already
very busy they should not become completely unresponsive. For one
thing, both compaction and streaming respect throttles which are
designed to minimize the impact of the streaming/compaction workload
resulting from repair.

> The nodes I am speaking of contain ~100 GB of data.

This is a relatively small amount of data per node, which makes the
impact of repair less severe.
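For reference, the sequence discussed above (raise the RF, make sure the throttles are set, then repair) could be sketched as below. The helper only builds the command strings, it does not run anything. The function name `rf_change_steps`, the keyspace name, the host names, the throttle values, the use of SimpleStrategy, and repairing the nodes one after another are all assumptions made for illustration.

```python
from typing import List


def rf_change_steps(keyspace: str, new_rf: int, hosts: List[str],
                    compaction_mb_s: int, stream_mbit_s: int) -> List[str]:
    """Return, in order, the commands for raising RF and then repairing."""
    steps = [
        # CQL statement, run once (e.g. from cqlsh).
        # SimpleStrategy is an assumption; use the keyspace's real strategy.
        "ALTER KEYSPACE {} WITH replication = "
        "{{'class': 'SimpleStrategy', 'replication_factor': {}}};".format(keyspace, new_rf)
    ]
    for host in hosts:
        # Throttles that limit how hard compaction and streaming hit a node.
        steps.append("nodetool -h {} setcompactionthroughput {}".format(host, compaction_mb_s))
        steps.append("nodetool -h {} setstreamthroughput {}".format(host, stream_mbit_s))
    for host in hosts:
        # Repair each node in turn so the new replicas receive their data.
        steps.append("nodetool -h {} repair {}".format(host, keyspace))
    return steps
```

For example, `rf_change_steps("stats", 3, ["n1", "n2"], 16, 200)` yields one ALTER KEYSPACE statement, four throttle commands and two repair commands; the values 16 and 200 are placeholders, not recommendations.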

> Also, some of the keyspaces I am considering increasing the replication factor
> for contain Counter Column Families (they have rf:1). I think I have read that
> adding replication to counter CFs will affect performance negatively, is
> this correct?

Per Sylvain (one of the primary authors of the Counters codebase) [1]:

"
For counters, it's a little bit different. At RF=3, for each insert,
one node is doing a write *and* a read, while the two other nodes are
only doing a write. So given that the time the read takes is
non-negligible, you should see a simple improvement at RF=3 compared
to RF=1, because each node gets 1/3 of the reads (involved in the
counter write) it would get if it was the only replica. Now if the
write time were negligible compared to the read time, then yes, you
would see roughly a 3x increase. But while writes are still faster
than reads in Cassandra, reads are now fairly fast too (but all this
depends on other factors, like how much the caches help, etc.), so it
will likely be less than a 3x increase. Should be noticeable though.
"
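The arithmetic in the quoted argument can be worked through with a small sketch. It models only what the quote states: every replica applies the write, and just one of the RF replicas also does the read. The function names and the unit costs are hypothetical, and caches and other factors the quote mentions are ignored.

```python
def per_node_cost(read_cost: float, write_cost: float, rf: int) -> float:
    """Average work one replica does per counter increment.

    Every replica applies the write, but only one of the rf replicas
    also does the read, so each replica sees 1/rf of the reads.
    """
    return read_cost / rf + write_cost


def improvement(read_cost: float, write_cost: float, rf: int) -> float:
    """Ratio of per-replica work at RF=1 to per-replica work at the given rf."""
    return per_node_cost(read_cost, write_cost, 1) / per_node_cost(read_cost, write_cost, rf)
```

With a negligible write cost (`improvement(1.0, 0.0, 3)`) the ratio is about 3, the "roughly a 3x increase" case. With reads and writes costing the same (`improvement(1.0, 1.0, 3)`) it drops to about 1.5, which matches the "less than a 3x increase" conclusion.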

I interpret the above to mean that RF=3 is actually slightly *faster*
for Counters than RF=1.

=Rob

[1] http://mail-archives.apache.org/mod_mbox/cassandra-user/201110.mbox/%3CCAKkz8Q0ThzzSBu2370MX6jPeEC3Lh17Pjmv1koJGgAuaJupCtQ@mail.gmail.com%3E