From: Mihai Stanescu
Date: Fri, 3 Aug 2018 18:03:20 +0200
Subject: Re: Cassandra rate dropping over long term test
To: user@cassandra.apache.org

I looked at the compaction history on the affected node, both for a period when it was affected and for a period when it was not.

The number of compactions is fairly similar, and so is the amount of work.

Not affected time
[root@cassandra7 ~]# nodetool compactionhistory | grep 02T22
fda43ca0-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:59:47.946 433124864 339496194 {1:3200576, 2:2025936, 3:262919}
8a83e2c0-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:56:34.796 133610579 109321990 {1:1574352, 2:434814}
01811e20-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:52:44.930 132847372 108175388 {1:1577164, 2:432508}

Experiencing more read I/O
[root@cassandra7 ~]# nodetool compactionhistory | grep 03T12
389aa220-970c-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:58:57.986 470326446 349948622 {1:2590960, 2:2600102, 3:298369}
81fe6f10-970b-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:53:51.617 143850880 115555226 {1:1686260, 2:453627}
ce418e30-970a-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:48:50.067 147035600 119201638 {1:1742318, 2:452226}
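
To put a number on "fairly similar", the two listings can be totalled with a short script. This is only a sketch: it assumes the usual nodetool compactionhistory column order (id, keyspace, table, compacted_at, bytes_in, bytes_out, rows_merged), and the function name summarize is made up for illustration.

```python
# Rows copied from the two nodetool listings in this message.
NOT_AFFECTED = """\
fda43ca0-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:59:47.946 433124864 339496194 {1:3200576, 2:2025936, 3:262919}
8a83e2c0-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:56:34.796 133610579 109321990 {1:1574352, 2:434814}
01811e20-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:52:44.930 132847372 108175388 {1:1577164, 2:432508}
"""

AFFECTED = """\
389aa220-970c-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:58:57.986 470326446 349948622 {1:2590960, 2:2600102, 3:298369}
81fe6f10-970b-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:53:51.617 143850880 115555226 {1:1686260, 2:453627}
ce418e30-970a-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:48:50.067 147035600 119201638 {1:1742318, 2:452226}
"""


def summarize(text):
    """Return (compaction count, total bytes_in, total bytes_out)."""
    count = total_in = total_out = 0
    for line in text.splitlines():
        # Assumed layout: id, keyspace, table, compacted_at, bytes_in,
        # bytes_out, rows_merged (the last field may contain spaces).
        fields = line.split(None, 6)
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        count += 1
        total_in += int(fields[4])
        total_out += int(fields[5])
    return count, total_in, total_out


if __name__ == "__main__":
    for label, text in (("not affected", NOT_AFFECTED), ("affected", AFFECTED)):
        count, total_in, total_out = summarize(text)
        print(f"{label}: {count} compactions, {total_in} bytes in, {total_out} bytes out")
```

On these six rows the affected hour reads roughly 9% more bytes into compaction than the unaffected hour, with the same number of compactions in each sample.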

During a read, the row should mostly be in a single SSTable, since it was only inserted and then read, so this is strange.

We have a partition key and then a clustering key.

Rows that were just written should be in kernel buffers, and the rows whose delete never happened are never read again either, so the kernel should only need to hold the most recent data.

I remain puzzled.



On Fri, Aug 3, 2018 at 3:58 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
Probably Compaction

Cassandra data files are immutable

The write path first appends to a commitlog, then puts data into the memtable. When the memtable hits a threshold, it’s flushed to data files on disk (let’s call the first one “1”, second “2” and so on)
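
A toy model of that write path, as a rough sketch only. The class name ToyNode and the row-count flush_threshold are invented for illustration; a real memtable flushes on memory and commitlog pressure, not on a fixed number of rows.

```python
from types import MappingProxyType


class ToyNode:
    """Toy write path: commitlog append, memtable update, flush to files."""

    def __init__(self, flush_threshold=3):
        # Hypothetical threshold: number of distinct keys per memtable.
        self.flush_threshold = flush_threshold
        self.commitlog = []   # append-only record of every write
        self.memtable = {}    # mutable, in memory
        self.sstables = []    # flushed files, read-only once written

    def write(self, key, value, timestamp):
        # 1. Append to the commitlog first.
        self.commitlog.append((key, value, timestamp))
        # 2. Then put the data into the memtable.
        self.memtable[key] = (value, timestamp)
        # 3. Flush once the memtable reaches the threshold.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Files are numbered 1, 2, ... and wrapped in a read-only view
        # to mimic immutability.
        number = len(self.sstables) + 1
        frozen = MappingProxyType(dict(self.memtable))
        self.sstables.append((number, frozen))
        self.memtable = {}


if __name__ == "__main__":
    node = ToyNode(flush_threshold=2)
    for i in range(5):
        node.write(f"row{i}", f"payload{i}", timestamp=i)
    print([number for number, _ in node.sstables], node.memtable)
```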

Over time we build up multiple data files on disk - when Cassandra reads, it will merge data in those files to give you the result you expect, choosing the latest value for each column
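
The read-side merge can be sketched the same way. The function merge_read and the dict layout are assumptions for illustration; deletes (tombstones), which also take part in a real merge, are not modeled here.

```python
def merge_read(sstables, key):
    """Merge one row across data files, keeping the newest value per column.

    Each file is modeled as {row_key: {column: (value, write_timestamp)}}.
    """
    merged = {}
    for table in sstables:
        for column, (value, timestamp) in table.get(key, {}).items():
            # Keep a column only if it is new or has a later timestamp.
            if column not in merged or timestamp > merged[column][1]:
                merged[column] = (value, timestamp)
    return {column: value for column, (value, _) in merged.items()}


if __name__ == "__main__":
    file_1 = {"m1": {"body": ("hello", 10), "status": ("new", 10)}}
    file_2 = {"m1": {"status": ("read", 20)}}
    print(merge_read([file_1, file_2], "m1"))
```

The more files a row is spread across, the more lookups each read needs, which is the cost compaction is meant to reduce.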

But it’s usually wasteful to keep lots of files around, and that merging is expensive, so compaction combines those data files behind the scenes in a background thread.

By default they’re combined when 4 or more files are approximately the same size, so if your write rate is such that you fill and flush the memtable every 5 minutes, compaction will likely happen at least every 20 minutes (sometimes more). This is called size tiered compaction; there are 4 strategies but size tiered is default and easiest to understand.
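
The 5-minute / 20-minute arithmetic can be checked with a small simulation. It is a deliberately simplified sketch: files count as "the same size" only when their sizes match exactly, a merged file is assumed to be the sum of its inputs (real compaction output is usually smaller), and the function name simulate is made up. min_threshold mirrors the size-tiered option of that name, whose default is 4.

```python
from collections import Counter


def simulate(minutes, flush_every=5, min_threshold=4):
    """Return [(minute, merged_size), ...] for a toy size-tiered run."""
    sizes = Counter()   # file size -> number of files of that size
    compactions = []
    for minute in range(flush_every, minutes + 1, flush_every):
        sizes[1] += 1   # each flush adds one unit-sized file
        size = 1
        # A merge may produce the 4th file of the next tier, so cascade.
        while sizes[size] >= min_threshold:
            sizes[size] -= min_threshold
            merged = size * min_threshold
            sizes[merged] += 1
            compactions.append((minute, merged))
            size = merged
    return compactions


if __name__ == "__main__":
    for minute, merged in simulate(80):
        print(f"minute {minute}: compacted into a file of size {merged}")
```

In this toy run a compaction fires every 20 minutes, and at minute 80 a second, larger one follows immediately because four size-4 files now exist. That is the "sometimes more" case.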

You’re seeing mostly writes because the reads are likely in page cache (the kernel doesn’t need to go to disk to read the files, it’s got them in memory for serving normal reads).

--
Jeff Jirsa


> On Aug 3, 2018, at 12:30 AM, Mihai Stanescu <mihai.stanescu@gmail.com> wrote:
>
> Hi all,
>
> I am perf-testing Cassandra over a long run in a cluster of 8 nodes, and I noticed that the rate of service drops.
> Most of the nodes have CPU between 40% and 65%; however, one of the nodes has higher CPU and has also started performing a lot of read IOPS, as seen in the image (green is read IOPS).
>
> My test has a mixed read/write scenario:
> 1. insert row
> 2. after 60 seconds, read row
> 3. delete row
>
> The rate of inserts is higher than the rate of deletes, so some deletes will not happen.
>
> I have checked the client: it does not accumulate RAM and GC is a straight line, so I don't understand what's going on.
>
> Any hints?
>
> Regards,
> Mihai
>
> <image.png>
