From: Nikolai Grigoriev
To: user@cassandra.apache.org
Date: Sun, 23 Nov 2014 22:37:41 -0500
Subject: Re: Compaction Strategy guidance

Just to clarify - when I was talking about the large amount of data I really meant a large amount of data per node in a single CF (table). LCS does not seem to like it when it gets thousands of sstables (makes 4-5 levels).
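
The "4-5 levels" figure follows from how LCS sizes its levels. A minimal sketch, assuming the commonly documented layout in which L1 holds about 10 sstables and each higher level holds 10x more (the function name and the fanout parameter are illustrative, not Cassandra APIs):

```python
def levels_needed(total_sstables: int, fanout: int = 10) -> int:
    """Return the highest LCS level needed to hold `total_sstables`.

    Assumption: level N holds fanout**N equally sized sstables
    (L1 = 10, L2 = 100, ...). L0 is ignored here.
    """
    level = 0
    capacity = 1
    remaining = total_sstables
    while remaining > 0:
        level += 1
        capacity *= fanout      # capacity of the level being filled
        remaining -= capacity   # sstables that still need a higher level
    return level


for count in (5_000, 11_110, 20_000):
    print(count, "sstables ->", levels_needed(count), "levels")
```

Under these assumptions a few thousand sstables fit in four levels, and anything beyond 11,110 (10 + 100 + 1,000 + 10,000) needs a fifth.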

When bootstrapping a new node you'd better enable the option from CASSANDRA-6621 (the one that disables STCS in L0). But it will still be a mess - I have a node that I bootstrapped ~2 weeks ago. Initially it had 7.5K pending compactions; now it has almost stabilized at 4.6K and does not go down. The number of sstables at L0 is over 11K and it is very slowly building the upper levels. The total number of sstables is 4x the normal amount. Now I am not entirely sure if this node will ever get back to normal life. And believe me - this is not because of I/O, I have SSDs everywhere and 16 physical cores. This machine is barely using 1-3 cores most of the time. The problem is that allowing STCS fallback is not a good option either - it will quickly result in a few 200 GB+ sstables in my configuration and then these sstables will never be compacted. Plus, it will require close to 2x disk space on EVERY disk in my JBOD configuration... this will kill the node sooner or later. This is all because all sstables after bootstrap end up in L0 and then the process very slowly moves them to other levels. If you have write traffic to that CF then the number of sstables in L0 will grow quickly - like it happens in my case now.
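
To make the 2x disk space point concrete, here is a rough sketch. It assumes the usual worst case in which a compaction writes its full output before the input sstables are deleted and nothing is purged, so the temporary space needed is about the sum of the inputs. It also assumes, as a simplification, that the output lands on the same disk as the inputs. The disk names and sizes are made up for illustration; they are not measurements from the node described above.

```python
def worst_case_temp_gb(input_sizes_gb):
    """Space needed to write the compacted output before inputs are removed.

    Assumption: no data is purged or merged away, so output ~= sum of inputs.
    """
    return sum(input_sizes_gb)


def disks_that_cannot_compact(disks):
    """Return names of JBOD disks whose free space cannot hold the output.

    `disks` maps a hypothetical disk name to (capacity_gb, sstable_sizes_gb).
    Simplification: the output is assumed to be written to the same disk.
    """
    stuck = []
    for name, (capacity_gb, sizes_gb) in disks.items():
        free_gb = capacity_gb - sum(sizes_gb)
        if worst_case_temp_gb(sizes_gb) > free_gb:
            stuck.append(name)
    return stuck


# Hypothetical layout: two 800 GB disks, one holding large STCS sstables.
example = {
    "disk1": (800, [200, 200, 50]),  # 450 GB used, 350 GB free
    "disk2": (800, [160, 100, 40]),  # 300 GB used, 500 GB free
}
stuck = disks_that_cannot_compact(example)
print("disks without enough headroom:", stuck)
```

In this made-up layout a disk that is a bit more than half full can no longer compact everything it holds, which is why a few very large sstables per disk are a problem.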

Once something like https://issues.apache.org/jira/browse/CASSANDRA-8301 is implemented it may be better.


On Sun, Nov 23, 2014 at 4:53 AM, Andrei Ivanov <aivanov@iponweb.net> wrote:
Stephane,

We are having a somewhat similar C* load profile. Hence some comments
in addition to Nikolai's answer.
1. Fallback to STCS - you can disable it actually
2. Based on our experience, if you have a lot of data per node, LCS
may work just fine. That is, till the moment you decide to join
another node - chances are that the newly added node will not be able
to compact what it gets from old nodes. In your case, if you switch
strategy the same thing may happen. This is all due to limitations
mentioned by Nikolai.

Andrei,


On Sun, Nov 23, 2014 at 8:51 AM, Servando Muñoz G. <smgesi@gmail.com> wrote:
> ABUSE
>
>
>
> I DON'T WANT ANY MORE EMAILS, I'M FROM MEXICO
>
>
>
> From: Nikolai Grigoriev [mailto:ngrigoriev@gmail.com]
> Sent: Saturday, November 22, 2014 07:13 PM
> To: user@cassandra.apache.org
> Subject: Re: Compaction Strategy guidance
> Importance: High
>
>
>
> Stephane,
>
> Like everything good, LCS comes at a certain price.
>
> LCS will put the most load on your I/O system (if you use spindles - you may need
> to be careful about that) and on CPU. Also LCS (by default) may fall back to
> STCS if it is falling behind (which is very possible with heavy writing
> activity) and this will result in higher disk space usage. Also LCS has
> a certain limitation I have discovered lately. Sometimes LCS may not be able
> to use all your node's resources (algorithm limitations) and this reduces
> the overall compaction throughput. This may happen if you have a large
> column family with lots of data per node. STCS won't have this limitation.
>
>
>
> By the way, the primary goal of LCS is to reduce the number of sstables C*
> has to look at to find your data. With LCS functioning properly this number
> will most likely be between 1 and 3 for most of the reads.
> But if you do few reads and are not concerned about the latency today,
> LCS will most likely only save you some disk space.
>
>
>
> On Sat, Nov 22, 2014 at 6:25 PM, Stephane Legay <slegay@looplogic.com>
> wrote:
>
> Hi there,
>
>
>
> use case:
>
>
>
> - Heavy write app, few reads.
>
> - Lots of updates of rows / columns.
>
> - Current performance is fine, for both writes and reads.
>
> - Currently using SizeTieredCompactionStrategy
>
>
>
> We're trying to limit the amount of storage used during compaction. Should
> we switch to LeveledCompactionStrategy?
>
>
>
> Thanks
>
>
>
>
> --
>
> Nikolai Grigoriev
> (514) 772-5178



--
Nikolai Grigoriev
(514) 772-5178