Subject: Re: nodetool repair seems to increase linearly with number of keyspaces
From: "Christopher J. Bottaro" <cjbottaro@academicworks.com>
To: Cassandra User Mailing List <user@cassandra.apache.org>
Date: Tue, 26 Nov 2013 12:17:43 -0600
We only have a single CF per keyspace. Actually we have 2, but one is tiny (it only has 2 rows and is queried once a month or less).

Yup, using vnodes with 256 tokens.

Cassandra 1.2.10.

-- C

On Mon, Nov 25, 2013 at 2:28 PM, John Pyeatt wrote:

> Mr. Bottaro,
>
> About how many column families are in your keyspaces? We have 28 per
> keyspace.
>
> Are you using vnodes? We are, and they are set to 256.
>
> What version of Cassandra are you running? We are running 1.2.9.
>
>
> On Mon, Nov 25, 2013 at 11:36 AM, Christopher J. Bottaro <
> cjbottaro@academicworks.com> wrote:
>
>> We have the same setup: one keyspace per client, and currently about 300
>> keyspaces. nodetool repair takes a long time: 4 hours with -pr on a single
>> node. We have a 4-node cluster with about 10 GB per node. Unfortunately,
>> we haven't been keeping track of the running time as keyspaces, or load,
>> increase.
>>
>> -- C
>>
>>
>> On Wed, Nov 20, 2013 at 6:53 AM, John Pyeatt wrote:
>>
>>> We have an application that has been designed to use potentially hundreds
>>> of keyspaces (one for each company).
>>>
>>> One thing we are noticing is that the time for nodetool repair across all
>>> of the keyspaces seems to increase linearly with the number of keyspaces.
>>> For example, if we have a 6-node EC2 (m1.large) cluster across 3
>>> Availability Zones and create 20 keyspaces, a nodetool repair -pr on one
>>> node takes 3 hours, even with no data in any of the keyspaces. If I bump
>>> that up to 40 keyspaces, it takes 6 hours.
>>>
>>> Is this the behaviour you would expect?
>>>
>>> Is there anything you can think of (short of redesigning the cluster to
>>> limit keyspaces) to increase the performance of the nodetool repairs?
>>>
>>> My obvious concern is that as this application grows and more companies
>>> use it, we will eventually have too many keyspaces to perform repairs on
>>> the cluster.
>>>
>>> --
>>> John Pyeatt
>>> Singlewire Software, LLC
>>> www.singlewire.com
>>> ------------------
>>> 608.661.1184
>>> john.pyeatt@singlewire.com
>>
>
>
> --
> John Pyeatt
> Singlewire Software, LLC
> www.singlewire.com
> ------------------
> 608.661.1184
> john.pyeatt@singlewire.com
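[Archive note: the linear scaling John reports can be sanity-checked with simple arithmetic. The sketch below uses his numbers from the thread (20 keyspaces → 3 hours, 40 keyspaces → 6 hours, single node with -pr); the function names are illustrative, not part of any Cassandra tooling.]

```python
def per_keyspace_minutes(total_hours: float, n_keyspaces: int) -> float:
    """Average repair time per keyspace, in minutes, for one node's -pr run."""
    return total_hours * 60 / n_keyspaces

# John's measurements, taken on empty keyspaces:
#   20 keyspaces -> 3 hours, 40 keyspaces -> 6 hours
rate_20 = per_keyspace_minutes(3, 20)   # 9.0 minutes per keyspace
rate_40 = per_keyspace_minutes(6, 40)   # 9.0 minutes per keyspace
# The identical per-keyspace cost is what "linear in keyspace count" means here:
# repair overhead is paid per keyspace, independent of the data in it.

def projected_hours(n_keyspaces: int, minutes_per_ks: float = 9.0) -> float:
    """Extrapolate one node's repair time at the same per-keyspace rate."""
    return n_keyspaces * minutes_per_ks / 60

# At Christopher's ~300 keyspaces this rate would predict 45 hours per node,
# so his observed 4 hours suggests his per-keyspace overhead is much lower
# (different hardware, vnode layout, or Cassandra version may all matter).
```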