cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-9666) Provide an alternative to DTCS
Date Fri, 26 Jun 2015 23:46:06 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-9666:
----------------------------------
    Description: 
DTCS is great for time series data, but it comes with caveats that make it difficult to use
in production (typical operator behaviors such as bootstrap, removenode, and repair have MAJOR
caveats as they relate to max_sstable_age_days, and hints/read repair break the selection
algorithm).

I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices the tiered nature
of DTCS in order to address some of DTCS' operational shortcomings. I believe it is necessary
to propose an alternative rather than simply adjusting DTCS, because the new strategy fundamentally
removes the tiered nature in order to remove the parameter max_sstable_age_days - the result is
very different, even if it is heavily inspired by DTCS. 

Specifically, rather than creating a number of windows of ever increasing sizes, this strategy
allows an operator to choose the window size, compacts with STCS within the first window of
that size, and aggressively compacts down to a single sstable once that window is no longer current.
The window size is a combination of unit (minutes, hours, days) and size (1, etc), such that
an operator can expect all data falling within a block of that size to be compacted together (that
is, if your unit is hours, and size is 6, you will create roughly 4 sstables per day, each
one containing roughly 6 hours of data). 
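To make the unit/size arithmetic concrete, here is a minimal sketch of how a timestamp could be
mapped to its window. It is not taken from the patch: the function name window_start, the
UNIT_SECONDS table, and the choice to align windows to the Unix epoch are all assumptions made
for illustration.

```python
from datetime import datetime, timezone

# Seconds per unit; the unit names are illustrative, not the patch's option values.
UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}


def window_start(timestamp_s: int, unit: str, size: int) -> int:
    """Return the epoch-second start of the window that contains timestamp_s.

    Assumes windows are aligned to the Unix epoch, so every timestamp in
    [start, start + width) shares the same start value.
    """
    width = UNIT_SECONDS[unit] * size
    return timestamp_s - (timestamp_s % width)


if __name__ == "__main__":
    ts = int(datetime(2015, 6, 26, 23, 46, 6, tzinfo=timezone.utc).timestamp())
    start = window_start(ts, "HOURS", 6)
    print(datetime.fromtimestamp(start, tz=timezone.utc).isoformat())
```

Under these assumptions, a unit of hours and a size of 6 yields four distinct window starts
per UTC day, which lines up with the "roughly 4 sstables per day" expectation above.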

The result addresses a number of the problems with DateTieredCompactionStrategy:

- At the present time, DTCS’s first window is compacted using unusual selection criteria,
which prefer files with earlier timestamps, but ignore sizes. In TimeWindowCompactionStrategy,
the first window data will be compacted with the well tested, fast, reliable STCS. All STCS
options can be passed to TimeWindowCompactionStrategy to configure the first window’s compaction
behavior.

- HintedHandoff may put old data in new sstables, but it will have little impact other than
slightly reduced efficiency (sstables will cover a wider range, but the old timestamps will
not impact sstable selection criteria during compaction)

- ReadRepair may put old data in new sstables, but it will have little impact other than slightly
reduced efficiency (sstables will cover a wider range, but the old timestamps will not impact
sstable selection criteria during compaction)

- Small, old sstables resulting from streams of any kind will be swiftly and aggressively
compacted with the other sstables matching their similar maxTimestamp, without causing sstables
in neighboring windows to grow in size.

- The configuration options are explicit and straightforward - the tuning parameters leave
little room for error. The window is set in common, easily understandable terms such as “12
hours”, “1 Day”, “30 days”. The minute/hour/day options are granular enough for
users keeping data for hours, and users keeping data for years. 

- There is no explicitly configurable max sstable age, though sstables will naturally stop
compacting once new data is written in that window. 

- Streaming operations can create sstables with old timestamps, and they'll naturally be joined
together with sstables in the same time bucket. This is true for bootstrap/repair/sstableloader/removenode.


- It remains true that if old data and new data are written into the memtable at the same time,
the resulting sstables will be treated as if they were new sstables. However, that no longer
negatively impacts the compaction strategy’s selection criteria for older windows. 
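The selection behavior described in the list above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the code in the patch: SSTable, bucket_by_window
and next_compaction are hypothetical names, windows are assumed to be epoch-aligned, and the
"take the smallest min_threshold files" rule is only a crude stand-in for real STCS, which
groups files of similar size.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class SSTable(NamedTuple):
    name: str
    max_timestamp_s: int  # newest write timestamp in the file, epoch seconds
    size_bytes: int


def bucket_by_window(sstables: List[SSTable], width_s: int) -> Dict[int, List[SSTable]]:
    """Group sstables by the window that contains their max timestamp."""
    buckets = defaultdict(list)
    for table in sstables:
        start = table.max_timestamp_s - (table.max_timestamp_s % width_s)
        buckets[start].append(table)
    return dict(buckets)


def next_compaction(sstables: List[SSTable], now_s: int, width_s: int,
                    min_threshold: int = 4) -> List[SSTable]:
    """Pick the next set of sstables to compact together (hypothetical policy)."""
    buckets = bucket_by_window(sstables, width_s)
    current = now_s - (now_s % width_s)

    # Current window: simplified stand-in for STCS (smallest files first).
    in_current = sorted(buckets.get(current, []), key=lambda t: t.size_bytes)
    if len(in_current) >= min_threshold:
        return in_current[:min_threshold]

    # Older windows: anything with more than one sstable is merged down to one.
    # Checking the most recent old window first is an assumption of this sketch.
    for start in sorted((s for s in buckets if s < current), reverse=True):
        if len(buckets[start]) > 1:
            return buckets[start]
    return []
```

Because grouping uses only the max timestamp, an sstable that carries old hinted or
read-repaired data alongside new writes simply lands in the current window, and a small
streamed sstable with an old max timestamp is merged with the others in its own window
without touching its neighbors.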

Patch provided for both 2.1 ( https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 ) and
2.2 ( https://github.com/jeffjirsa/cassandra/commits/twcs )



  was:
DTCS is great for time series data, but it comes with caveats that make it difficult to use
in production (typical operator behaviors such as bootstrap, removenode, and repair have MAJOR
caveats as they relate to max_sstable_age_days, and hints/read repair break the selection
algorithm).

I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices the tiered nature
of DTCS in order to address some of DTCS' operational shortcomings. I believe it is necessary
to propose an alternative rather than simply adjusting DTCS, because the new strategy fundamentally
removes the tiered nature in order to remove the parameter max_sstable_age_days - the result is
very different, even if it is heavily inspired by DTCS. 

Specifically, rather than creating a number of windows of ever increasing sizes, this strategy
allows an operator to choose the window size, compacts with STCS within the first window of
that size, and aggressively compacts down to a single sstable once that window is no longer current.
The window size is a combination of unit (minutes, hours, days) and size (1, etc), such that
an operator can expect all data falling within a block of that size to be compacted together (that
is, if your unit is hours, and size is 6, you will create roughly 4 sstables per day, each
one containing roughly 6 hours of data). 

The result addresses a number of the problems with DateTieredCompactionStrategy:

- At the present time, DTCS’s first window is compacted using unusual selection criteria,
which prefer files with earlier timestamps, but ignore sizes. In TimeWindowCompactionStrategy,
the first window data will be compacted with the well tested, fast, reliable STCS. All STCS
options can be passed to TimeWindowCompactionStrategy to configure the first window’s compaction
behavior.

- HintedHandoff may put old data in new sstables, but it will have little impact other than
slightly reduced efficiency (sstables will cover a wider range, but the old timestamps will
not impact sstable selection criteria during compaction)

- ReadRepair may put old data in new sstables, but it will have little impact other than slightly
reduced efficiency (sstables will cover a wider range, but the old timestamps will not impact
sstable selection criteria during compaction)

- Small, old sstables resulting from streams of any kind will be swiftly and aggressively
compacted with the other sstables matching their similar maxTimestamp, without causing sstables
in neighboring windows to grow in size.

- The configuration options are explicit and straightforward - the tuning parameters leave
little room for error. The window is set in common, easily understandable terms such as “12
hours”, “1 Day”, “30 days”. The minute/hour/day options are granular enough for
users keeping data for hours, and users keeping data for years. 

- There is no explicitly configurable max sstable age, though sstables will naturally stop
compacting once new data is written in that window. 

- It remains true that if old data and new data are written into the memtable at the same time,
the resulting sstables will be treated as if they were new sstables. However, that no longer
negatively impacts the compaction strategy’s selection criteria for older windows. 

Patch provided for both 2.1 ( https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 ) and
2.2 ( https://github.com/jeffjirsa/cassandra/commits/twcs )




> Provide an alternative to DTCS
> ------------------------------
>
>                 Key: CASSANDRA-9666
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9666
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jeff Jirsa
>             Fix For: 2.1.x, 2.2.x
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
