cassandra-user mailing list archives

From Jeff Jirsa <jji...@gmail.com>
Subject Re: Academic paper about Cassandra database compaction
Date Tue, 15 May 2018 07:44:57 GMT
On Mon, May 14, 2018 at 11:04 AM, Lucas Benevides <
lucas@maurobenevides.com.br> wrote:

> Thank you, Jeff Jirsa, for your comments.
>
> How can we do this:  "fix this by not scheduling the major compaction
> until we know all of the sstables in the window are available to be
> compacted"?
>
>
It would require a change to TWCS itself. Right here, where we grab the
not-currently-compacting sstables (
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L110
), we'd also grab the compacting set, and if the candidate sstables for the
task overlap the same window as the compacting sstables (respecting the
repaired/unrepaired/pending-repair sets), we'd skip scheduling the compaction
until the previous compactions finish.
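As a rough sketch of the idea - standalone, not the actual TWCS code, and with
illustrative class and method names rather than real Cassandra internals - the
scheduler would group candidates by window and defer a window's compaction
while any sstable from that window is still part of a running compaction:

    // Illustrative sketch only -- not the actual TWCS implementation.
    // Defers scheduling a per-window compaction while any sstable in that
    // window is still part of a running compaction.
    import java.util.*;

    class WindowedCompactionScheduler {

        /** Stand-in for an sstable: all we care about here is its window. */
        static class SSTable {
            final long windowStart;   // lower bound of the TWCS window, epoch millis
            SSTable(long windowStart) { this.windowStart = windowStart; }
        }

        /**
         * Returns the sstables to compact for a window, or an empty list if
         * any sstable in that window is still compacting, so the window major
         * compaction is deferred until the earlier compactions finish.
         */
        static List<SSTable> candidatesForWindow(long windowStart,
                                                 Collection<SSTable> nonCompacting,
                                                 Collection<SSTable> compacting) {
            for (SSTable s : compacting)
                if (s.windowStart == windowStart)
                    return Collections.emptyList();   // defer: window not fully available

            List<SSTable> candidates = new ArrayList<>();
            for (SSTable s : nonCompacting)
                if (s.windowStart == windowStart)
                    candidates.add(s);
            return candidates;
        }
    }

In the real change, the repaired/unrepaired/pending-repair sets would also
have to be kept separate so a single compaction never mixes them.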


> About the column-family schema, I had to customize the cassandra-stress
> tool so that it could create a reasonable number of rows per partition. With
> the default behavior it keeps generating repeated clustering keys for each
> partition, so most data gets updated instead of inserted.
>

A similar customization may be useful for creating partitions that are
narrowly bucketed into fixed-size time windows (a common and typical schema in
IoT use cases).
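To make that concrete - this is only an illustration, with a hypothetical
one-day bucket and made-up column names, not the stress tool's schema - the
writer derives a time-bucket value from the event timestamp and adds it as a
second partition key component, so each partition lands in exactly one TWCS
window:

    // Illustrative only: derive a time-bucket partition key component so each
    // partition aligns with a single TWCS window. The 1-day bucket and the
    // column names are assumptions, not taken from the paper or from Cassandra.
    import java.time.Duration;
    import java.time.Instant;

    class TimeBucketedKey {
        // Should match (or evenly divide) the TWCS compaction_window_size.
        static final Duration BUCKET = Duration.ofDays(1);

        /** Bucket identifier for a given event timestamp (epoch-day index here). */
        static long bucketFor(Instant eventTime) {
            return eventTime.toEpochMilli() / BUCKET.toMillis();
        }

        public static void main(String[] args) {
            Instant now = Instant.now();
            // Hypothetical schema this would feed:
            //   PRIMARY KEY ((sensor_id, time_bucket), event_time)
            System.out.printf("INSERT ... sensor_id=%s, time_bucket=%d, event_time=%s%n",
                              "sensor-42", bucketFor(now), now);
        }
    }

The bucket size should match (or evenly divide) the TWCS
compaction_window_size so a partition never straddles windows.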

- Jeff





>
> Lucas B. Dias
>
> 2018-05-14 14:03 GMT-03:00 Jeff Jirsa <jjirsa@gmail.com>:
>
>> Interesting!
>>
>> I suspect I know what causes the increased disk usage in TWCS, and it's a
>> solvable problem. The problem is roughly this:
>> - Window 1 has sstables 1, 2, 3, 4, 5, 6
>> - We start compacting 1, 2, 3, 4 (using STCS-in-TWCS first window)
>> - The TWCS window rolls over
>> - We flush (sstable 7), and trigger the TWCS window major compaction,
>> which starts compacting 5, 6, 7 + any other sstable from that window
>> - If the first compaction (1, 2, 3, 4) has finished by the time sstable 7 is
>> flushed, we'll include its result in that compaction; if it hasn't, we'll
>> have to do the major compaction twice to guarantee we have exactly one
>> sstable per window, which temporarily increases disk space
>>
>> We can likely fix this by not scheduling the major compaction until we
>> know all of the sstables in the window are available to be compacted.
>>
>> Also, your data model is probably typical, but it's not well suited to time
>> series cases - if you find my 2016 Cassandra Summit TWCS talk (it's on
>> YouTube), I mention aligning partition keys to TWCS windows, which involves
>> adding a second component to the partition key. This is hugely important for
>> making sure TWCS data expires quickly and for avoiding having to read from
>> more than one TWCS window at a time.
>>
>>
>> - Jeff
>>
>>
>>
>> On Mon, May 14, 2018 at 7:12 AM, Lucas Benevides <
>> lucas@maurobenevides.com.br> wrote:
>>
>>> Dear community,
>>>
>>> I want to tell you about my paper, published at a conference in March.
>>> The title is "NoSQL Database Performance Tuning for IoT Data - Cassandra
>>> Case Study" and it is available (not for free) at
>>> http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0006782702770284 .
>>>
>>> TWCS is used and compared with DTCS.
>>>
>>> I hope you can download it; unfortunately, I cannot send copies as the
>>> publisher holds the copyright.
>>>
>>> Lucas B. Dias
>>>
>>>
>>>
>>
>
