From: Alexander Dejanovski
Date: Mon, 13 Nov 2017 14:41:40 +0000
Subject: Re: STCS leaving sstables behind
To: user@cassandra.apache.org

And actually, full repair with 3.0/3.x would have the same effect
(anticompaction) unless you're using subrange repair.

On Mon, Nov 13, 2017 at 3:28 PM Jeff Jirsa wrote:

> Running incremental repair puts sstables into a "repaired" set (and an
> unrepaired set), which results in something similar to what you're
> describing.
>
> Were you running / did you run incremental repair?
>
> --
> Jeff Jirsa
>
> On Nov 13, 2017, at 5:04 AM, Nicolas Guyomar wrote:
>
> Hi everyone,
>
> I'm facing quite a strange behavior with STCS on 3.0.13: the strategy
> seems to have "forgotten" about old sstables and started a completely new
> cycle from scratch, leaving the old sstables on disk untouched.
>
> Something happened on Nov 10 on every node, which resulted in all those
> sstables being left behind:
>
> -rw-r--r--. 8 cassandra cassandra  15G Nov  9 22:22 mc-4828-big-Data.db
> -rw-r--r--. 8 cassandra cassandra 4.8G Nov 10 01:39 mc-4955-big-Data.db
> -rw-r--r--. 8 cassandra cassandra 2.4G Nov 10 01:45 mc-4957-big-Data.db
> -rw-r--r--. 8 cassandra cassandra 662M Nov 10 01:47 mc-4959-big-Data.db
> -rw-r--r--. 8 cassandra cassandra 2.8G Nov 10 03:46 mc-5099-big-Data.db
> -rw-r--r--. 8 cassandra cassandra 4.6G Nov 10 03:58 mc-5121-big-Data.db
> -rw-r--r--. 7 cassandra cassandra  53M Nov 10 08:45 mc-5447-big-Data.db
> -rw-r--r--. 7 cassandra cassandra 219M Nov 10 08:46 mc-5454-big-Data.db
> -rw-r--r--. 7 cassandra cassandra 650M Nov 10 08:46 mc-5452-big-Data.db
> -rw-r--r--. 7 cassandra cassandra 1.2G Nov 10 08:48 mc-5458-big-Data.db
> -rw-r--r--. 7 cassandra cassandra 1.5G Nov 10 08:50 mc-5465-big-Data.db
> -rw-r--r--. 7 cassandra cassandra 504M Nov 10 09:39 mc-5526-big-Data.db
> -rw-r--r--. 7 cassandra cassandra  57M Nov 10 09:40 mc-5527-big-Data.db
> -rw-r--r--. 7 cassandra cassandra 101M Nov 10 09:41 mc-5532-big-Data.db
> -rw-r--r--. 7 cassandra cassandra  86M Nov 10 09:41 mc-5533-big-Data.db
> -rw-r--r--. 7 cassandra cassandra 134M Nov 10 09:42 mc-5537-big-Data.db
> -rw-r--r--. 7 cassandra cassandra 3.9G Nov 10 09:54 mc-5538-big-Data.db
> *-rw-r--r--. 7 cassandra cassandra 1.3G Nov 10 09:57 mc-5548-big-Data.db*
> -rw-r--r--. 6 cassandra cassandra  16G Nov 11 01:23 mc-6474-big-Data.db
> -rw-r--r--. 4 cassandra cassandra  17G Nov 12 06:44 mc-7898-big-Data.db
> -rw-r--r--. 3 cassandra cassandra 8.2G Nov 12 13:45 mc-8226-big-Data.db
> -rw-r--r--. 2 cassandra cassandra 6.8G Nov 12 22:38 mc-8581-big-Data.db
> -rw-r--r--. 2 cassandra cassandra 6.1G Nov 13 03:10 mc-8937-big-Data.db
> -rw-r--r--. 2 cassandra cassandra 3.1G Nov 13 04:12 mc-9019-big-Data.db
> -rw-r--r--. 2 cassandra cassandra 3.0G Nov 13 05:56 mc-9112-big-Data.db
> -rw-r--r--. 2 cassandra cassandra 1.2G Nov 13 06:14 mc-9138-big-Data.db
> -rw-r--r--. 2 cassandra cassandra 1.1G Nov 13 06:27 mc-9159-big-Data.db
> -rw-r--r--. 2 cassandra cassandra 1.2G Nov 13 06:46 mc-9182-big-Data.db
> -rw-r--r--. 1 cassandra cassandra 1.9G Nov 13 07:18 mc-9202-big-Data.db
> -rw-r--r--. 1 cassandra cassandra 353M Nov 13 07:22 mc-9207-big-Data.db
> -rw-r--r--. 1 cassandra cassandra 120M Nov 13 07:22 mc-9208-big-Data.db
> -rw-r--r--. 1 cassandra cassandra 100M Nov 13 07:23 mc-9209-big-Data.db
> -rw-r--r--. 1 cassandra cassandra  67M Nov 13 07:25 mc-9210-big-Data.db
> -rw-r--r--. 1 cassandra cassandra  51M Nov 13 07:25 mc-9211-big-Data.db
> -rw-r--r--. 1 cassandra cassandra  73M Nov 13 07:27 mc-9212-big-Data.db
>
> TRACE logs for the Compaction Manager show that sstables from before
> Nov 10 are grouped in different buckets than the ones after Nov 10.
>
> At first I thought of some coldness behavior that would filter out those
> "old" sstables, but looking at the code
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java#L237
> I don't see any coldness or time pattern used to create buckets.
>
> I tried restarting the node, but the buckets are still grouped in 2
> "groups" split around Nov 10.
>
> I may have missed something in the logs, but they are clear of
> errors/warnings around that Nov 10 time.
>
> For what it's worth, restarting the node fixed nodetool status reporting
> a wrong Load (nearly 2 TB per node instead of 300 GB). We have been
> loading data for a week now, and it seems that this can happen sometimes.
>
> If anyone has ever experienced that kind of behavior, I'd be glad to know
> whether it is OK or not. I'd like to avoid manually triggering JMX
> UserDefinedCompaction ;)
>
> Thank you
>
--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
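
To illustrate why a repaired/unrepaired split can look like STCS "forgetting" old sstables, here is a minimal sketch in Python. It is not Cassandra's code: it only mimics the size-tiered grouping rule (an sstable joins a bucket when its size is within 0.5x to 1.5x of the bucket average, the usual bucket_low/bucket_high defaults) and applies it separately to each repair state, which is my understanding of how 3.0 handles the two sets. The function names, the sample sizes and the repaired flags are all made up for the example; the thread does not say which of the listed sstables are actually marked repaired.

```python
from collections import defaultdict

# Assumed STCS defaults (bucket_low, bucket_high, min_sstable_size).
BUCKET_LOW, BUCKET_HIGH = 0.5, 1.5
MIN_SSTABLE_SIZE = 50 * 1024**2  # 50 MB


def size_tiered_buckets(sizes):
    """Group sizes (bytes) into buckets of similar size, STCS-style (simplified)."""
    buckets = []  # each entry is [running average, [member sizes]]
    for size in sorted(sizes):
        for bucket in buckets:
            avg = bucket[0]
            similar = avg * BUCKET_LOW < size < avg * BUCKET_HIGH
            both_small = size < MIN_SSTABLE_SIZE and avg < MIN_SSTABLE_SIZE
            if similar or both_small:
                bucket[1].append(size)
                bucket[0] = sum(bucket[1]) / len(bucket[1])
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size, [size]])
    return [members for _, members in buckets]


def buckets_per_repair_state(sstables):
    """Bucket repaired and unrepaired sstables independently.

    `sstables` is an iterable of (name, size_in_bytes, repaired) tuples;
    the tuple layout is hypothetical, chosen for this sketch only.
    """
    by_state = defaultdict(list)
    for _name, size, repaired in sstables:
        by_state[repaired].append(size)
    return {state: size_tiered_buckets(sizes) for state, sizes in by_state.items()}


if __name__ == "__main__":
    MB = 1024**2
    # Hypothetical data: four similarly sized sstables, two of them repaired.
    sample = [
        ("old-1", 1000 * MB, True),
        ("old-2", 1100 * MB, True),
        ("new-1", 1050 * MB, False),
        ("new-2", 1200 * MB, False),
    ]
    print(buckets_per_repair_state(sample))
    print(size_tiered_buckets([size for _, size, _ in sample]))
```

With the sample data, the four sstables would land in a single bucket if they were considered together, but each repair state only gets a bucket of two, which is below the default min_threshold of 4, so neither would be picked for compaction.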