Subject: Re: nodetool repair seems to increase linearly with number of keyspaces
From: "Christopher J. Bottaro" <cjbottaro@academicworks.com>
To: Cassandra User Mailing List <user@cassandra.apache.org>
Date: Tue, 26 Nov 2013 12:17:43 -0600
We only have a single CF per keyspace. Actually we have 2, but one is tiny (it only has 2 rows and is queried once a month or less).

Yup, using vnodes with 256 tokens.

Cassandra 1.2.10.

-- C

On Mon, Nov 25, 2013 at 2:28 PM, John Pyeatt wrote:

> Mr. Bottaro,
>
> About how many column families are in your keyspaces? We have 28 per
> keyspace.
>
> Are you using vnodes? We are, and they are set to 256.
>
> What version of Cassandra are you running? We are running 1.2.9.
>
>
> On Mon, Nov 25, 2013 at 11:36 AM, Christopher J. Bottaro <
> cjbottaro@academicworks.com> wrote:
>
>> We have the same setup: one keyspace per client, and currently about 300
>> keyspaces. nodetool repair takes a long time: 4 hours with -pr on a single
>> node. We have a 4-node cluster with about 10 GB per node. Unfortunately,
>> we haven't been keeping track of the running time as keyspaces, or load,
>> increase.
>>
>> -- C
>>
>>
>> On Wed, Nov 20, 2013 at 6:53 AM, John Pyeatt wrote:
>>
>>> We have an application that has been designed to use potentially hundreds
>>> of keyspaces (one for each company).
>>>
>>> One thing we are noticing is that the time for nodetool repair across all
>>> of the keyspaces seems to increase linearly with the number of keyspaces.
>>> For example, if we have a 6-node EC2 (m1.large) cluster across 3
>>> Availability Zones and create 20 keyspaces, a nodetool repair -pr on one
>>> node takes 3 hours, even with no data in any of the keyspaces. If I bump
>>> that up to 40 keyspaces, it takes 6 hours.
>>>
>>> Is this the behaviour you would expect?
>>>
>>> Is there anything you can think of (short of redesigning the cluster to
>>> limit keyspaces) to increase the performance of the nodetool repairs?
>>>
>>> My obvious concern is that as this application grows and more companies
>>> use it, we will eventually have too many keyspaces to perform repairs on
>>> the cluster.
>>>
>>> --
>>> John Pyeatt
>>> Singlewire Software, LLC
>>> www.singlewire.com
>>> ------------------
>>> 608.661.1184
>>> john.pyeatt@singlewire.com
>>
>
>
> --
> John Pyeatt
> Singlewire Software, LLC
> www.singlewire.com
> ------------------
> 608.661.1184
> john.pyeatt@singlewire.com
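[Archive note: the linear scaling John reports can be sanity-checked with simple arithmetic. The sketch below uses his numbers from the thread (20 keyspaces → 3 hours, 40 keyspaces → 6 hours, single node with -pr); the function names are illustrative, not part of any Cassandra tooling.]

```python
def per_keyspace_minutes(total_hours: float, n_keyspaces: int) -> float:
    """Average repair time per keyspace, in minutes, for one node's -pr run."""
    return total_hours * 60 / n_keyspaces

# John's measurements, taken on empty keyspaces:
#   20 keyspaces -> 3 hours, 40 keyspaces -> 6 hours
rate_20 = per_keyspace_minutes(3, 20)   # 9.0 minutes per keyspace
rate_40 = per_keyspace_minutes(6, 40)   # 9.0 minutes per keyspace
# The identical per-keyspace cost is what "linear in keyspace count" means here:
# repair overhead is paid per keyspace, independent of the data in it.

def projected_hours(n_keyspaces: int, minutes_per_ks: float = 9.0) -> float:
    """Extrapolate one node's repair time at the same per-keyspace rate."""
    return n_keyspaces * minutes_per_ks / 60

# At Christopher's ~300 keyspaces this rate would predict 45 hours per node,
# so his observed 4 hours suggests his per-keyspace overhead is much lower
# (different hardware, vnode layout, or Cassandra version may all matter).
```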