cassandra-commits mailing list archives

From "Wei Deng (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-10306) Splitting SSTables in time, deleting and archiving SSTables
Date Fri, 15 Jan 2016 17:04:39 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Wei Deng updated CASSANDRA-10306:
---------------------------------
    Labels: dtcs  (was: )

> Splitting SSTables in time, deleting and archiving SSTables
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-10306
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10306
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Antti Nissinen
>              Labels: dtcs
>
> This document is a continuation of [CASSANDRA-10195|https://issues.apache.org/jira/browse/CASSANDRA-10195]
and describes the need to split SSTable files by time, as also discussed in [CASSANDRA-8361|https://issues.apache.org/jira/browse/CASSANDRA-8361].
The data model is explained briefly, followed by the practical issues of running Cassandra
with time series data and the need for splitting capabilities.
> Data model: (snippet from [CASSANDRA-9644|https://issues.apache.org/jira/browse/CASSANDRA-9644])
> The data is time series data. It is stored so that one row contains a fixed time span
of data for a given metric (20 days in this case). The row key contains the start time of
the time span and the metric name. The column name gives the offset from the beginning of
the time span. The column timestamp is set to the sum of the start time from the row key
and the offset, i.e. the actual timestamp of the data point. The data model is analogous
to the KairosDB implementation.
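> The mapping from a data point to its row key, column name and column timestamp can be
illustrated with a minimal Python sketch. It is not taken from the application or from KairosDB.
The names SPAN_MS, CellAddress and locate are hypothetical, and the sketch assumes millisecond
timestamps and row spans aligned to the Unix epoch, neither of which is stated in the description.

```python
from typing import NamedTuple, Tuple

# Assumption: timestamps are in milliseconds and row spans are aligned to the epoch.
SPAN_MS = 20 * 24 * 60 * 60 * 1000  # the 20-day row span mentioned in the description


class CellAddress(NamedTuple):
    row_key: Tuple[str, int]  # (metric name, start of the time span)
    column_offset: int        # column name: offset from the start of the span
    column_timestamp: int     # column write timestamp: span start + offset


def locate(metric: str, ts_ms: int) -> CellAddress:
    """Map one data point to its row key, column name and column timestamp."""
    span_start = ts_ms - ts_ms % SPAN_MS
    offset = ts_ms - span_start
    # The column timestamp equals the actual timestamp of the data point.
    return CellAddress((metric, span_start), offset, span_start + offset)
```

With this layout, all points of one metric that fall into the same 20-day span share a row,
and the column timestamp always equals the timestamp of the data point itself.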
> In the practical application, data is added to the column family in real time. When
converting from a legacy system, old data is preloaded in chronological order by faking the
column timestamps before real-time data collection starts. However, there is an intermittent
need to insert older data as well, either because it was not available in real time or because
additional time series are fed in afterward due to unforeseeable needs.
> Adding old data simultaneously with real-time data leads to SSTables that contain data
from a time period exceeding the length of the compaction window (TWCS and DTCS). Such SSTables
therefore do not behave predictably in the compaction process.
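> As a rough illustration of the problem, the following Python sketch counts how many fixed-size
time windows the timestamp range of one SSTable touches. It is a simplification and not Cassandra
code: the functions window_index and windows_spanned are hypothetical, and the windows are assumed
to be fixed-size and aligned to the epoch.

```python
def window_index(ts: int, window: int) -> int:
    """Index of the fixed-size, epoch-aligned window that contains ts (assumed layout)."""
    return ts // window


def windows_spanned(min_ts: int, max_ts: int, window: int) -> int:
    """Number of windows touched by an SSTable whose data covers [min_ts, max_ts]."""
    if max_ts < min_ts:
        raise ValueError("max_ts must not be smaller than min_ts")
    return window_index(max_ts, window) - window_index(min_ts, window) + 1
```

An SSTable for which this count is greater than one mixes data from several windows, which is
the situation described above.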
> Tombstones mask the data from queries, but releasing disk space requires that the SSTables
containing the tombstones be compacted together with the SSTables holding the original
data. With TWCS or DTCS, when tombstones are written with a timestamp corresponding to the
current (real) time, the SSTables containing the original data will not end up being compacted
with the SSTables holding the tombstones. Even when tombstones are written with faked timestamps,
the SSTable should be written separately from the ongoing real-time data. Otherwise the SSTables
have to be split (see below).

> TTL is a working method for deleting data from a column family and releasing disk space
in a predictable manner. However, setting the correct TTL is not a trivial task. The required
TTL might change, e.g. due to legislation or because the customer would like a longer lifetime
for the data.
> The other factor affecting disk space consumption is variability in the rate at which
data is fed into the column family. In certain troubleshooting cases the sample rate
can be increased tenfold for a large portion of the collected time series. This leads to
rapid consumption of disk space, and old data has to be deleted or archived in such a manner
that disk space is released quickly and predictably.
> Losing one or more nodes from the cluster without spare hardware also leads to a situation
where the data from the lost node has to be replicated again to the remaining nodes.
This increases disk space consumption per node and probably requires cleaning older data
out of the active column family.
> All of the above issues could of course be handled simply by adding more disk space or
nodes to the cluster. In a cloud environment that would be a feasible option. For an application
running on real hardware in an isolated environment it is not feasible, for practical
reasons or because of cost. Getting new hardware on site might take a long time, e.g. due to
customs regulations.
> In the application domain (time series data collection) the data is not modified after
it is inserted into the column family. There will only be read operations and deletion / archiving
of old data based on the TTL or operator actions.
> The above reasoning leads to the following conclusions and proposals.
> * TWCS and DTCS (with certain modifications) lead to well-structured SSTables that are
organized by time, which gives opportunities to manage the available disk capacity on nodes.
Recovery after repairs also works (the flood of small SSTables is compacted with larger
ones).
> * Being able to split SSTables effectively along a given timeline would lead to SSTable
sets on all nodes that allow whole SSTables to be deleted or archived. What would be the
mechanism for inactivating SSTables during deletion / archiving so that nodes don’t start streaming
“missing” data between each other (repairs)?
> * Being able to split existing SSTables along multiple timelines determined by TWCS would
allow older data to be inserted into the column family so that it is eventually compacted in
the desired manner in the correct time window. The original SSTable would be streamed into
several SSTables according to the time windows. In the end, empty SSTables would be discarded.
> * The splitting action would be a tool executed through the nodetool command when needed.
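
> The proposed split along time windows can be sketched in Python as follows. This is only
an illustration of the idea, not an existing Cassandra or nodetool feature: the function
split_by_window is hypothetical, cells are modelled as plain (key, timestamp, value) tuples,
and the windows are assumed to be fixed-size and aligned to the epoch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Cell = Tuple[str, int, object]  # (key, write timestamp, value); a simplified stand-in


def split_by_window(cells: Iterable[Cell], window: int) -> Dict[int, List[Cell]]:
    """Stream the cells of one input SSTable into one output bucket per time window.

    Buckets are keyed by the start of their window. A bucket is only created
    when a cell falls into it, so no empty outputs remain to be discarded.
    """
    buckets: Dict[int, List[Cell]] = defaultdict(list)
    for key, ts, value in cells:
        window_start = ts // window * window
        buckets[window_start].append((key, ts, value))
    # Return the buckets ordered by window start, oldest first.
    return dict(sorted(buckets.items()))
```

Each resulting bucket contains data from a single window only, so it could be compacted,
archived or deleted together with the other SSTables of that window.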



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
