From: Brian Spindler <brian.spindler@gmail.com>
Date: Tue, 7 Aug 2018 21:13:32 -0400
To: user@cassandra.apache.org
Subject: Re: TWCS Compaction backed up
Hi, I spot checked a couple of the files that were ~200MB and they mostly
had "Repaired at: 0", so maybe that's not it?

-B

On Tue, Aug 7, 2018 at 8:16 PM <brian.spindler@gmail.com> wrote:

> Everything is ttl'd.
>
> I suppose I could use sstablemetadata to see the repaired bit. Could I
> just set that to unrepaired somehow, and would that fix it?
>
> Thanks!
>
> On Aug 7, 2018, at 8:12 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
>
> May be worth seeing if any of the sstables got promoted to repaired - if
> so, they're not eligible for compaction with unrepaired sstables, and
> that could explain some higher counts.
>
> Do you actually do deletes, or is everything ttl'd?
>
> --
> Jeff Jirsa
>
> On Aug 7, 2018, at 5:09 PM, Brian Spindler <brian.spindler@gmail.com>
> wrote:
>
> Hi Jeff, mostly lots of little files: there will be 4-5 that are
> 1-1.5 GB or so, then many at 5-50 MB, and many at 40-50 MB each.
>
> Re incremental repair: yes, one of my engineers started an incremental
> repair on this column family that we had to abort. In fact, the node
> that the repair was initiated on ran out of disk space, and we ended up
> replacing that node like a dead node.
>
> Oddly, the new node is experiencing this issue as well.
>
> -B
>
> On Tue, Aug 7, 2018 at 8:04 PM Jeff Jirsa <jjirsa@gmail.com> wrote:
>
>> You could toggle off the tombstone compaction to see if that helps, but
>> that should be lower priority than normal compactions.
>>
>> Are the lots-of-little-files from memtable flushes or
>> repair/anticompaction?
>>
>> Do you do normal deletes? Did you try to run incremental repair?
>>
>> --
>> Jeff Jirsa
>>
>> On Aug 7, 2018, at 5:00 PM, Brian Spindler <brian.spindler@gmail.com>
>> wrote:
>>
>> Hi Jonathan, both I believe.
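The repaired-flag inspection and reset discussed above can be sketched roughly as follows. The data path is a placeholder, and `sstablerepairedset` (shipped in Cassandra's tools/bin) rewrites metadata in place, so it should only be run while the node is stopped:

```shell
# Placeholder path -- substitute your actual keyspace/table data directory.
DATA_DIR=/var/lib/cassandra/data/my_ks/my_cf

# Inspect the repaired timestamp recorded in each SSTable's metadata.
# "Repaired at: 0" means the SSTable is still marked unrepaired.
for f in "$DATA_DIR"/*-Data.db; do
  echo "$f"
  sstablemetadata "$f" | grep "Repaired at"
done

# With the node stopped, mark SSTables unrepaired again so they become
# eligible to compact with the rest of the unrepaired set:
sstablerepairedset --really-set --is-unrepaired "$DATA_DIR"/*-Data.db
```

These commands target a live Cassandra installation, so treat the invocation details as an assumption to verify against your version's tool help output.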
>> The window size is 1 day; full settings:
>>
>>     AND compaction = {'timestamp_resolution': 'MILLISECONDS',
>>       'unchecked_tombstone_compaction': 'true',
>>       'compaction_window_size': '1', 'compaction_window_unit': 'DAYS',
>>       'tombstone_compaction_interval': '86400',
>>       'tombstone_threshold': '0.2', 'class':
>>       'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy'}
>>
>> nodetool tpstats:
>>
>> Pool Name                    Active  Pending    Completed  Blocked  All time blocked
>> MutationStage                     0        0  68582241832        0                 0
>> ReadStage                         0        0    209566303        0                 0
>> RequestResponseStage              0        0  44680860850        0                 0
>> ReadRepairStage                   0        0     24562722        0                 0
>> CounterMutationStage              0        0            0        0                 0
>> MiscStage                         0        0            0        0                 0
>> HintedHandoff                     1        1          203        0                 0
>> GossipStage                       0        0      8471784        0                 0
>> CacheCleanupExecutor              0        0          122        0                 0
>> InternalResponseStage             0        0       552125        0                 0
>> CommitLogArchiver                 0        0            0        0                 0
>> CompactionExecutor                8       42      1433715        0                 0
>> ValidationExecutor                0        0         2521        0                 0
>> MigrationStage                    0        0       527549        0                 0
>> AntiEntropyStage                  0        0         7697        0                 0
>> PendingRangeCalculator            0        0           17        0                 0
>> Sampler                           0        0            0        0                 0
>> MemtableFlushWriter               0        0       116966        0                 0
>> MemtablePostFlush                 0        0       209103        0                 0
>> MemtableReclaimMemory             0        0       116966        0                 0
>> Native-Transport-Requests         1        0   1715937778        0            176262
>>
>> Message type      Dropped
>> READ                    2
>> RANGE_SLICE             0
>> _TRACE                  0
>> MUTATION             4390
>> COUNTER_MUTATION        0
>> BINARY                  0
>> REQUEST_RESPONSE     1882
>> PAGED_RANGE             0
>> READ_REPAIR             0
>>
>> On Tue, Aug 7, 2018 at 7:57 PM Jonathan Haddad <jon@jonhaddad.com>
>> wrote:
>>
>>> What's your window size?
>>>
>>> When you say backed up, how are you measuring that? Are there pending
>>> tasks, or do you just see more files than you expect?
>>>
>>> On Tue, Aug 7, 2018 at 4:38 PM Brian Spindler
>>> <brian.spindler@gmail.com> wrote:
>>>
>>>> Hey guys, quick question:
>>>>
>>>> I've got a v2.1 Cassandra cluster, 12 nodes on AWS i3.2xl, commit log
>>>> on one drive, data on NVMe. That was working very well; it's a
>>>> time-series DB and has been accumulating data for about 4 weeks.
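The CompactionExecutor row in the tpstats output above (8 active, 42 pending) is the backlog signal in this thread. A couple of hedged nodetool invocations for watching and throttling that backlog; the throughput value shown is an example, not a recommendation:

```shell
# Per-task compaction progress plus the pending-task count:
nodetool compactionstats

# Compaction throughput cap in MB/s. 0 disables throttling entirely,
# which can starve reads; stepping the cap back down is cheap to try.
nodetool getcompactionthroughput
nodetool setcompactionthroughput 256
```

These require a running node, so they are operational sketches rather than something to copy verbatim.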
>>>>
>>>> The nodes have increased in load, and compaction seems to be falling
>>>> behind. I used to get about 1 file per day for this column family,
>>>> about a ~30 GB Data.db file per day. I am now getting hundreds per
>>>> day at 1 MB - 50 MB.
>>>>
>>>> How to recover from this?
>>>>
>>>> I can scale out to give some breathing room, but will it go back and
>>>> compact the old days into nicely packed files for the day?
>>>>
>>>> I tried setting compaction throughput to 1000 from 256, and it seemed
>>>> to make things worse for the CPU; it's configured on i3.2xl with 8
>>>> compaction threads.
>>>>
>>>> -B
>>>>
>>>> Lastly, I have mixed TTLs in this CF and need to run a repair (I
>>>> think) to get rid of old tombstones. However, running repairs in 2.1
>>>> on TWCS column families causes a very large spike in sstable counts
>>>> due to anti-compaction, which causes a lot of disruption. Is there
>>>> any other way?
>>>
>>> --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade
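One way to quantify the "hundreds of files per day" observation in the thread is to bucket each table's Data.db files by their last-modified day, a rough proxy for the TWCS window they belong to. The path is a placeholder, and `-printf` assumes GNU find:

```shell
# Placeholder; point at the table's data directory.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data/my_ks/my_cf}"

# Count Data.db files per calendar day of last modification. With 1-day
# TWCS windows you expect roughly one large file per fully compacted day;
# a day with hundreds of entries is a window compaction never caught up on.
find "$DATA_DIR" -name '*-Data.db' -printf '%TY-%Tm-%Td\n' 2>/dev/null \
  | sort | uniq -c | sort -k2
```

The modification time is only an approximation of the window; sstablemetadata's min/max timestamps are authoritative if you need precision.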