cassandra-user mailing list archives

From Ezra Stuetzel <ezra.stuet...@riskiq.net>
Subject Re: large number of pending compactions, sstables steadily increasing
Date Tue, 23 Aug 2016 00:21:32 GMT
Yes, I am using vnodes. Each of our nodes has 256 tokens.
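
(That's num_tokens: 256 in cassandra.yaml; a quick way to confirm on a node, assuming the usual package config path:)

    grep num_tokens /etc/cassandra/cassandra.yaml   # expect: num_tokens: 256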

On Mon, Aug 22, 2016 at 2:57 AM, Carlos Alonso <info@mrcalonso.com> wrote:

> Are you using vnodes? I've heard of similar sstable explosion issues when
> operating with vnodes.
>
> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>
> On 20 August 2016 at 22:22, Ezra Stuetzel <ezra.stuetzel@riskiq.net>
> wrote:
>
>> Hey Mark,
>> Yes, there are frequent changes to rows. In fact, we re-write each row 5
>> times. 95% of our rows are TTL'ed, but it is the remaining 5% that aren't
>> TTL'ed that led us not to use date-tiered compaction. I think the node got
>> into a weird state and I'm not sure how, but it wound up with a lot of
>> sstables and many pending compactions. We did have a 6-node cluster, but we
>> wanted to change the machine type to one with more CPU and SSDs, so we
>> bootstrapped 4 new nodes one at a time and then removed the original 6
>> nodes one at a time. A few of those 6 nodes were running OOM, so we had to
>> assassinate them (some data loss was acceptable). When I increased the
>> compaction throughput and the number of compaction executors, I did not see
>> any change in the rate of increase of pending compactions. However, I did
>> not look at the number of sstables at the time. Now, looking at the graphs
>> below, increasing those two settings produced an immediate decline in
>> sstable count, but the dramatic decline in pending compactions lagged by
>> about 3 days. All nodes should have the same load, so I am hoping it won't
>> occur again. If it does recur, I'll try switching to size-tiered or
>> date-tiered. I increased the compaction settings between 8/17 and 8/18. We
>> have about 280GB per node for this table, except that this one problematic
>> node had about twice that; it seems to have recovered that space when the
>> pending compactions dropped off. Graphs of the sstable count, pending
>> compactions, and disk space are below; they start when the 4 new nodes
>> were being bootstrapped.
>>
>> [Graphs: sstable count, pending compactions, and disk space per node]
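>>
>> (For anyone hitting the same thing: the compaction throughput cap can be
>> changed at runtime with nodetool, while concurrent_compactors is a
>> cassandra.yaml setting; the values below are illustrative rather than
>> exactly what we used.)
>>
>>     nodetool setcompactionthroughput 64   # cap in MB/s; 0 = unthrottled
>>     nodetool compactionstats              # pending tasks + active compactions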
>>
>> On Fri, Aug 19, 2016 at 11:41 AM, Mark Rose <markrose@markrose.ca> wrote:
>>
>>> Hi Ezra,
>>>
>>> Are you making frequent changes to your rows (including TTL'ed
>>> values), or mostly inserting new ones? If you're only inserting new
>>> data, it's probable that size-tiered compaction would work better for
>>> you. If you are TTL'ing whole rows, consider date-tiered.
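>>>
>>> (If you do switch, it's a single schema change; roughly, using the
>>> keyspace/table names from your tablestats output below:)
>>>
>>>     cqlsh -e "ALTER TABLE mykeyspace.mytable WITH compaction = {'class': 'SizeTieredCompactionStrategy'};"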
>>>
>>> If leveled compaction is still the best strategy, one way to catch up
>>> with compactions is to have less data per node -- in other words, use
>>> more machines. Leveled compaction is CPU expensive. You are currently
>>> CPU-bottlenecked, or, from the other perspective, you have too much
>>> data per node for leveled compaction.
>>>
>>> At this point, compaction is so far behind that you'll likely be
>>> getting high latency if you're reading old rows (since dozens to
>>> hundreds of uncompacted sstables will likely need to be checked for
>>> matching rows). You may be better off with size-tiered compaction,
>>> even if it means always reading several sstables per read (higher
>>> latency than leveled compaction gives when it can keep up).
>>>
>>> How much data do you have per node? Do you update/insert to/delete
>>> rows? Do you TTL?
>>>
>>> Cheers,
>>> Mark
>>>
>>> On Wed, Aug 17, 2016 at 2:39 PM, Ezra Stuetzel <ezra.stuetzel@riskiq.net>
>>> wrote:
>>> > I have one node in my 2.2.7 cluster (just upgraded from 2.2.6 hoping to
>>> > fix the issue) which seems to be stuck in a weird state -- with a large
>>> > number of pending compactions and sstables. The node is compacting
>>> > about 500GB/day, and the number of pending compactions is going up by
>>> > about 50/day. It is at about 2300 pending compactions now. I have tried
>>> > increasing the number of compaction threads and the compaction
>>> > throughput, which doesn't seem to help eliminate the many pending
>>> > compactions.
>>> >
>>> > I have tried running 'nodetool cleanup' and 'nodetool compact'. The
>>> > latter has fixed the issue in the past, but most recently I was getting
>>> > OOM errors, probably due to the large number of sstables. I upgraded to
>>> > 2.2.7 and am no longer getting OOM errors, but the upgrade did not
>>> > resolve the issue either. I do see this message in the logs:
>>> >
>>> >> INFO  [RMI TCP Connection(611)-10.9.2.218] 2016-08-17 01:50:01,985
>>> >> CompactionManager.java:610 - Cannot perform a full major compaction as
>>> >> repaired and unrepaired sstables cannot be compacted together. These
>>> >> two set of sstables will be compacted separately.
>>> >
>>> > Below is the 'nodetool tablestats' output comparing a normal node and
>>> > the problematic node. You can see the problematic node has many, many
>>> > more sstables, and nearly all of them are stuck in level 0. What is the
>>> > best way to fix this? Can I just delete those sstables somehow and then
>>> > run a repair?
>>> >>
>>> >> Normal node
>>> >>>
>>> >>> Keyspace: mykeyspace
>>> >>>
>>> >>>     Read Count: 0
>>> >>>
>>> >>>     Read Latency: NaN ms.
>>> >>>
>>> >>>     Write Count: 31905656
>>> >>>
>>> >>>     Write Latency: 0.051713177939359714 ms.
>>> >>>
>>> >>>     Pending Flushes: 0
>>> >>>
>>> >>>         Table: mytable
>>> >>>
>>> >>>         SSTable count: 1908
>>> >>>
>>> >>>         SSTables in each level: [11/4, 20/10, 213/100, 1356/1000, 306, 0, 0, 0, 0]
>>> >>>
>>> >>>         Space used (live): 301894591442
>>> >>>
>>> >>>         Space used (total): 301894591442
>>> >>>
>>> >>>
>>> >>>
>>> >>> Problematic node
>>> >>>
>>> >>> Keyspace: mykeyspace
>>> >>>
>>> >>>     Read Count: 0
>>> >>>
>>> >>>     Read Latency: NaN ms.
>>> >>>
>>> >>>     Write Count: 30520190
>>> >>>
>>> >>>     Write Latency: 0.05171286705620116 ms.
>>> >>>
>>> >>>     Pending Flushes: 0
>>> >>>
>>> >>>         Table: mytable
>>> >>>
>>> >>>         SSTable count: 14105
>>> >>>
>>> >>>         SSTables in each level: [13039/4, 21/10, 206/100, 831, 0, 0, 0, 0, 0]
>>> >>>
>>> >>>         Space used (live): 561143255289
>>> >>>
>>> >>>         Space used (total): 561143255289
>>> >
>>> > Thanks,
>>> >
>>> > Ezra
>>>
>>
>>
>
