cassandra-user mailing list archives

From Zsolt Pálmai <zpal...@gmail.com>
Subject Re: OOM after a while during compacting
Date Thu, 05 Apr 2018 14:23:45 GMT
Yeah, they are pretty much unique but it's only a few requests per day so
hitting all the nodes would be fine for now.

2018-04-05 15:43 GMT+02:00 Evelyn Smith <u5015159@gmail.com>:

> Not sure if it differs for SASI Secondary Indexes but my understanding is
> it’s a bad idea to use high cardinality columns for Secondary Indexes.
> Not sure what your data model looks like but I’d assume UUID would have
> very high cardinality.
>
> If that’s the case it pretty much guarantees any query on the secondary
> index will hit all the nodes, which is what you want to avoid.
>
> Also, Secondary Indexes are generally bad for Cassandra. If you don’t need
> them or there's a way around using them, I’d go with that.
>
> Regards,
> Eevee.
>
>
> On 5 Apr 2018, at 11:27 pm, Zsolt Pálmai <zpalmai@gmail.com> wrote:
>
> Tried both (although with the biggest table) and the result is the same.
>
> I stumbled upon this jira issue:
> https://issues.apache.org/jira/browse/CASSANDRA-12662
> Since the SASI indexes I use are only helping in debugging (for now) I
> dropped them and it seems the tables get compacted now (at least it made it
> further than before and the JVM metrics look healthy).
>
> Still this is not ideal as it would be nice to have those secondary
> indexes :/ .
>
> The columns I indexed are basically uuids (so I can match the rows from
> other systems but this is usually triggered manually so performance loss is
> acceptable).
> Is there a recommended index to use here? Or should I set
> the max_compaction_flush_memory_in_mb value? I saw that it can cause
> different kinds of problems... Or should I use the default secondary index?
>
> Thanks
>
>
>
> 2018-04-05 15:14 GMT+02:00 Evelyn Smith <u5015159@gmail.com>:
>
>> Probably a dumb question but it’s good to clarify.
>>
>> Are you compacting the whole keyspace or are you compacting tables one at
>> a time?
>>
>>
>> On 5 Apr 2018, at 9:47 pm, Zsolt Pálmai <zpalmai@gmail.com> wrote:
>>
>> Hi!
>>
>> I have a setup with 4 AWS nodes (m4.xlarge - 4 CPUs, 16GB RAM, 1TB SSD
>> each) and when running the nodetool compact command on any of the servers I
>> get an out-of-memory exception after a while.
>>
>> - Before calling compact I first did a repair, and before that there
>> was a bigger update on a lot of entries, so I guess a lot of sstables were
>> created. The repair created ~250 pending compaction tasks. I managed to
>> finish 2 of the nodes by upgrading to a 2xlarge machine with twice the
>> heap (but running the compact on them manually also killed one :/ so this
>> isn't an ideal solution).
>>
>> Some more info:
>> - Version is the newest 3.11.2 with java8u116
>> - Using LeveledCompactionStrategy (we have mostly reads)
>> - Heap size is set to 8GB
>> - Using G1GC
>> - I tried moving the memtable out of the heap. It helped but I still got
>> an OOM last night
>> - Concurrent compactors is set to 1 but it still happens and also tried
>> setting throughput between 16 and 128, no changes.
>> - Storage load is 127GB/140GB/151GB/155GB
>> - 1 keyspace, 16 tables but there are a few SASI indexes on big tables.
>> - The biggest partition I found was 90MB but that table has only 2
>> sstables attached and compacts in seconds. The rest are mostly single-row
>> partitions with a few tens of KB of data.
>> - Worst SSTable case: SSTables in each level: [1, 20/10, 106/100, 15, 0,
>> 0, 0, 0, 0]
>>
>> In the metrics it looks something like this before dying:
>> https://ibb.co/kLhdXH
>>
>> What the heap dump looks like of the top objects: https://ibb.co/ctkyXH
>>
>> The load is usually pretty low, the nodes are almost idling (avg 500
>> reads/sec, 30-40 writes/sec with occasional spikes of a few seconds with
>> >100 writes/sec) and the pending tasks are also usually around 0.
>>
>> Any ideas? I'm starting to run out of ideas. Maybe the secondary indexes
>> cause problems? I could finish some bigger compactions where there was no
>> index attached but I'm not 100% sure whether this is the cause.
>>
>> Thanks,
>> Zsolt
>>
>>
>>
>>
>>
>
>
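
The "SSTables in each level" line quoted in the original message can be read mechanically. As far as I understand the output, an entry written as count/limit (e.g. 20/10) marks a level holding more SSTables than its target, while a bare number means the level is within its target. Below is a minimal sketch, not an official tool: parse_lcs_levels and overfull_levels are hypothetical helper names, and the input is the line from the thread.

```python
import re


def parse_lcs_levels(line):
    """Parse a 'SSTables in each level: [...]' line into (count, limit) pairs.

    Assumption: an entry such as '20/10' means 20 SSTables are present where
    the level's target is 10; a bare number carries no limit (stored as None).
    """
    inside = re.search(r"\[(.*?)\]", line, re.S)
    if inside is None:
        raise ValueError("no [...] list found in line")
    levels = []
    for entry in inside.group(1).split(","):
        entry = entry.strip()
        if "/" in entry:
            count, limit = (int(part) for part in entry.split("/"))
        else:
            count, limit = int(entry), None
        levels.append((count, limit))
    return levels


def overfull_levels(levels):
    """Return (level_index, count, limit) for every level above its target."""
    return [
        (index, count, limit)
        for index, (count, limit) in enumerate(levels)
        if limit is not None and count > limit
    ]


# The "worst SSTable case" reported in the thread.
line = "SSTables in each level: [1, 20/10, 106/100, 15, 0, 0, 0, 0, 0]"
levels = parse_lcs_levels(line)
for level, count, limit in overfull_levels(levels):
    print(f"L{level}: {count} SSTables, target {limit}")
```

For the line above this flags L1 (20 against a target of 10) and L2 (106 against 100), i.e. the levels where leveled compaction still has pending work.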
