cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Dejanovski <a...@thelastpickle.com>
Subject Re: Speed up compaction
Date Thu, 13 Jun 2019 12:52:13 GMT
Hi Léo,

Major compactions in LCS (and minor as well) are very slow indeed and I'm
afraid there's not much you can do to speed things up. There are lots of
synchronized sections in the LCS code and it has to do a lot of comparisons
between sstables to make sure a partition won't end up in two sstables of
the same level.
A major compaction will be single threaded for obvious reasons, and while
this is happening you might have all the newly flushed SSTables that will
pile up in S0 since I don't see how Cassandra could achieve the "one
sstable per partition per level except L0" guarantee otherwise.

At this point, your best chance might be to switch the table to STCS, run a
major compaction using the "-s" flag (split output, which will create one
SSTable per size tier instead of a big fat one) and then back to LCS,
before or after your migration (whatever works best for you). If you go
down that path, I'd also recommend to try it up on one node using JMX to
alter the compaction strategy, run the major compaction with nodetool and
see if it's indeed faster than the LCS major compaction. Then, proceed node
by node using JMX (wait for the major compaction to go through between
nodes) and alter the schema only after the last node has been switched to
STCS.
You can use more "aggressive" compaction settings to limit read
fragmentation reducing max_threshold to 3 instead of 4 (the default).

Note that doing all this will impact your cluster performance in ways I
cannot predict, and should be attempted only if you really need to perform
this major compaction and cannot wait for it to go through at the current
pace.

Cheers,

-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


On Thu, Jun 13, 2019 at 2:07 PM Léo FERLIN SUTTON
<lferlin@mailjet.com.invalid> wrote:

> On Thu, Jun 13, 2019 at 12:09 PM Oleksandr Shulgin <
> oleksandr.shulgin@zalando.de> wrote:
>
>> On Thu, Jun 13, 2019 at 11:28 AM Léo FERLIN SUTTON
>> <lferlin@mailjet.com.invalid> wrote:
>>
>>>
>>> ## Cassandra configuration :
>>> 4 concurrent_compactors
>>> Current compaction throughput: 150 MB/s
>>> Concurrent reads/write are both set to 128.
>>>
>>> I have also temporarily stopped every repair operations.
>>>
>>> Any ideas about how I can speed this up ?
>>>
>>
>> Hi,
>>
>> What is the compaction strategy used by this column family?
>>
>> Do you observe this behavior on one of the nodes only?  Have you tried to
>> cancel this compaction and see if a new one is started and makes better
>> progress?  Can you try to restart the affected node?
>>
>> Regards,
>> --
>> Alex
>>
>> I can't believe I forgot that information.
>
>  Overall we are talking about a 1.08TB table, using LCS.
>
> SSTable count: 1047
>> SSTables in each level: [15/4, 10, 103/100, 918, 0, 0, 0, 0, 0]
>
> SSTable Compression Ratio: 0.5192269874287099
>
> Number of partitions (estimate): 7282253587
>
>
> We have recently (about a month ago) deleted about 25% of the data in that
> table.
>
> Letting Cassandra reclaim the disk space on it's own (via regular
> compactions) was too slow for us, so we wanted to force a compaction on the
> table to reclaim the disk space faster.
>
> The speed of the compaction doesn't seem out of the ordinary for the
> cluster, only before we haven't had such a big compaction and the speed
> alarmed us.
> We never have a big compaction backlog, most of the time less than 5
> pending tasks (per node)
>
> Finally but we are running Cassandra 3.0.18 and plan to upgrade to 3.11 as
> soon as our compactions are over.
>
> Regards,
>
> Leo
>

Mime
View raw message