From: sai krishnam raju potturi
Date: Fri, 9 Oct 2015 10:07:55 -0400
Subject: Re: Re : Nodetool Cleanup on multiple nodes in parallel
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=94eb2c036320ebecff0521ac8345

--94eb2c036320ebecff0521ac8345
Content-Type: text/plain; charset=UTF-8

Thanks Jonathan. I see an advantage in doing it one AZ or rack at a time.

On Thu, Oct 8, 2015 at 6:41 PM, Jonathan Haddad <jon@jonhaddad.com> wrote:

> My hunch is the bigger your cluster the less impact it will have, as each
> node takes part in a smaller and smaller % of total queries. Considering
> that compaction is always happening, I'd wager if you've got a big cluster
> (as you say you do) you'll probably be ok running several cleanups at a
> time.
>
> I'd say start one, see how your perf is impacted (if at all) and go from
> there.
>
> If you're running a proper snitch you could probably do an entire rack /
> AZ at a time.
>
>
> On Thu, Oct 8, 2015 at 3:08 PM sai krishnam raju potturi <
> pskraju88@gmail.com> wrote:
>
>> We plan to do it during non-peak hours, when customer traffic is lower.
>> That works out to 10 nodes a day, which is concerning as we have other
>> data centers to expand eventually.
>>
>> Cleanup is similar to compaction, which is CPU intensive and will affect
>> reads if this data center is serving traffic. Is running cleanup in
>> parallel advisable?
>>
>> On Thu, Oct 8, 2015, 17:53 Jonathan Haddad <jon@jonhaddad.com> wrote:
>>
>>> Unless you're close to running out of disk space, what's the harm in it
>>> taking a while? How big is your DC? At 45 min per node, you can do 32
>>> nodes a day. Diverting traffic away from a DC just to run cleanup feels
>>> like overkill to me.
>>>
>>>
>>>
>>> On Thu, Oct 8, 2015 at 2:39 PM sai krishnam raju potturi <
>>> pskraju88@gmail.com> wrote:
>>>
>>>> hi;
>>>> Our Cassandra cluster currently uses DSE 4.6. The underlying
>>>> Cassandra version is 2.0.14.
>>>>
>>>> We are planning on adding multiple nodes to one of our datacenters.
>>>> This requires "nodetool cleanup". The "nodetool cleanup" operation
>>>> takes around 45 mins for each node.
>>>>
>>>> DataStax documentation recommends running "nodetool cleanup" on one
>>>> node at a time. That would take a really long time, owing to the size
>>>> of our datacenter.
>>>>
>>>> If we were to divert the read and write traffic away from a particular
>>>> datacenter, could we run "cleanup" on multiple nodes in parallel for
>>>> that datacenter?
>>>>
>>>>
>>>> http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
>>>>
>>>>
>>>> thanks
>>>> Sai
>>>>
>>>

--94eb2c036320ebecff0521ac8345
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--94eb2c036320ebecff0521ac8345--
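
The numbers and the rack-at-a-time idea in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a tested operational script: the function names, the (host, rack) input format, the max_parallel limit and the 7.5-hour off-peak window (back-calculated from the "10 nodes a day" figure) are all hypothetical, and "nodetool -h <host> cleanup" assumes JMX is reachable from wherever the script runs.

```python
import subprocess
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def nodes_per_day(minutes_per_node, window_hours=24.0, parallel=1):
    """Nodes cleaned per day when `parallel` cleanups run at once in the window."""
    rounds = int(window_hours * 60 // minutes_per_node)
    return rounds * parallel


def group_by_rack(nodes):
    """Build {rack: [hosts]} from (host, rack) pairs (hypothetical input format)."""
    racks = defaultdict(list)
    for host, rack in nodes:
        racks[rack].append(host)
    return dict(racks)


def nodetool_cleanup(host):
    """Run cleanup against one node and return the process exit code."""
    return subprocess.call(["nodetool", "-h", host, "cleanup"])


def cleanup_rack_by_rack(nodes, run=nodetool_cleanup, max_parallel=4):
    """Clean one rack at a time, with at most `max_parallel` nodes in flight."""
    results = {}
    for rack, hosts in sorted(group_by_rack(nodes).items()):
        with ThreadPoolExecutor(max_workers=max_parallel) as pool:
            for host, code in zip(hosts, pool.map(run, hosts)):
                results[host] = code
        # Do not move on to the next rack if any node in this one failed.
        if any(results[h] != 0 for h in hosts):
            break
    return results


if __name__ == "__main__":
    print(nodes_per_day(45))                    # one node at a time, all day
    print(nodes_per_day(45, window_hours=7.5))  # assumed off-peak window
```

Passing a stand-in for `run` (for example a function that only logs the host and returns 0) gives a dry run of the rack ordering before anything touches the cluster. Starting with max_parallel=1 and raising it while watching read latency matches the "start one, see how your perf is impacted" advice above.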