From: Léo FERLIN SUTTON
Date: Thu, 13 Jun 2019 14:08:45 +0200
Subject: Re: very slow repair
To: user@cassandra.apache.org

> Last, but not least: are you using the default number of vnodes, 256?
> The overhead of a large number of vnodes (times the number of nodes) can
> be quite significant. We've seen major improvements in repair runtime
> after switching from 256 to 16 vnodes on Cassandra version 3.0.

Is there a recommended procedure to switch the number of vnodes?

Regards,

Leo

On Thu, Jun 13, 2019 at 12:06 PM Oleksandr Shulgin
<oleksandr.shulgin@zalando.de> wrote:

> On Thu, Jun 13, 2019 at 10:36 AM R. T. <rastrent@protonmail.com.invalid>
> wrote:
>
>> Well, actually by running cfstats I can see that totaldiskspaceused is
>> about ~1.2 TB per node in DC1 and ~1 TB per node in DC2. DC2 was off for
>> a while, that's why there is a difference in space.
>>
>> I am using Cassandra 3.0.6, and my
>> stream_throughput_outbound_megabits_per_sec is the default setting, so
>> according to my version that is 200 Mbps (25 MB/s).
>
> And the other setting: compaction_throughput_mb_per_sec? It is also
> highly relevant for repair performance, as streamed-in files need to be
> compacted with the existing files on the nodes. In our experience, a
> change in the compaction throughput limit is almost linearly reflected
> in the repair run time.
>
> The default 16 MB/s is too limiting for any production-grade setup, I
> believe. We go as high as 90 MB/s on AWS EBS gp2 data volumes. But don't
> take that as gospel: I'd suggest you start increasing the setting (e.g.
> by doubling it) and observe how it affects repair performance (and
> client latencies).
>
> Have you tried with "parallel" instead of "DC parallel" mode? The latter
> is really poorly named and actually means something else, as neatly
> highlighted in this SO answer: https://dba.stackexchange.com/a/175028
>
> Last, but not least: are you using the default number of vnodes, 256?
> The overhead of a large number of vnodes (times the number of nodes) can
> be quite significant. We've seen major improvements in repair runtime
> after switching from 256 to 16 vnodes on Cassandra version 3.0.
>
> Cheers,
> --
> Alex
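For reference, the throughput settings discussed above live in cassandra.yaml
and can also be changed at runtime with nodetool. A minimal sketch, assuming
stock nodetool on 3.0.x (the numbers are placeholders, not recommendations):

    # current values (MB/s for compaction, Mbit/s for streaming)
    nodetool getcompactionthroughput
    nodetool getstreamthroughput

    # raise them at runtime; these revert on restart unless the matching
    # cassandra.yaml keys (compaction_throughput_mb_per_sec,
    # stream_throughput_outbound_megabits_per_sec) are updated as well
    nodetool setcompactionthroughput 32
    nodetool setstreamthroughput 400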
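Likewise, a rough sketch of the repair modes mentioned (the keyspace name is
a placeholder; with plain nodetool on 3.0, "parallel" is what you get when
neither -seq nor -dcpar is passed):

    # parallel: all replicas of a range are repaired at the same time
    nodetool repair -full my_keyspace

    # "DC parallel": a different thing entirely, per the SO answer above
    nodetool repair -full -dcpar my_keyspace

    # sequential: one replica at a time, usually the slowest option
    nodetool repair -full -seq my_keyspace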
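As for switching the vnode count itself: num_tokens is also a cassandra.yaml
setting, but as far as I know it only takes effect when a node bootstraps
with no existing data, so an existing cluster is usually migrated by adding
nodes (e.g. a new DC) configured with the new value and rebuilding, rather
than editing the value in place:

    # cassandra.yaml on freshly bootstrapped nodes (the default is 256)
    num_tokens: 16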