From: kurt greaves
Date: Mon, 07 May 2018 23:12:40 +0000
Subject: Re: compaction: huge number of random reads
To: user@cassandra.apache.org

If you've got small partitions/small reads, you should test lowering your compression chunk size on the table and disabling read-ahead. This sounds like it might just be a case of read amplification.
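
Roughly, as a sketch only (ks.tbl, the 4 kB chunk value and /dev/xvdb below are placeholders for your own schema and data volume; 2.1 takes sstable_compression/chunk_length_kb, newer versions use class/chunk_length_in_kb):

    -- Cassandra 2.1 CQL: shrink the compression chunk so a small read
    -- decompresses one small chunk instead of the 64 kB default
    ALTER TABLE ks.tbl
      WITH compression = {'sstable_compression': 'LZ4Compressor',
                          'chunk_length_kb': 4};

    # existing SSTables keep their old chunk size until rewritten
    nodetool upgradesstables -a ks tbl

    # drop read-ahead on the data volume (not persistent across reboots)
    sudo blockdev --setra 0 /dev/xvdb
    blockdev --getra /dev/xvdb    # verify

With small chunks and no read-ahead, a small read touches roughly one filesystem block instead of a whole 64 kB chunk plus whatever the kernel reads ahead.
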
On Tue., 8 May 2018, 05:43 Kyrylo Lebediev <Kyrylo_Lebediev@epam.com> wrote:

> Dear Experts,
>
> I'm observing strange behavior on a 2.1.20 cluster during compactions.
>
> My setup is:
>
> 12 nodes, m4.2xlarge (8 vCPU, 32 GB RAM), Ubuntu 16.04, 2 TB EBS gp2
> Filesystem: XFS, block size 4k, device read-ahead: 4k
> /sys/block/xvdb/queue/nomerges = 0
> SizeTieredCompactionStrategy
>
> After data loads, when effectively nothing else is talking to the cluster
> and compaction is the only activity, I see something like this:
>
> $ iostat -dkx 1
> ...
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> xvdb              0.00     0.00 4769.00  213.00 19076.00 26820.00    18.42     7.95    1.17    1.06    3.76   0.20 100.00
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> xvdb              0.00     0.00 6098.00  177.00 24392.00 22076.00    14.81     6.46    1.36    0.96   15.16   0.16 100.00
>
> Writes are fine: 177 writes/sec <-> ~22 MB/sec,
> but for some reason compactions generate a huge number of small reads:
> 6098 reads/sec <-> ~24 MB/sec  ===>  read size is ~4 kB.
>
> Why am I getting a huge number of 4 kB reads instead of a much smaller
> number of large reads?
>
> What could be the reason?
>
> Thanks,
>
> Kyrill
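
For reference, the ~4 kB read size quoted above falls straight out of the iostat columns: average read size = rkB/s divided by r/s, i.e. 24392 / 6098 is about 4.0 kB per read in the second sample. A quick way to watch it live, assuming the sysstat column layout shown above (r/s in column 4, rkB/s in column 6):

    # average read size for xvdb, printed once per interval
    iostat -dkx 1 | awk '/^xvdb/ && $4 > 0 { printf "%.1f kB per read\n", $6 / $4 }'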