Date: Fri, 15 Jan 2016 17:04:39 +0000 (UTC)
From: "Wei Deng (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-10306) Splitting SSTables in time, deleting and archiving SSTables

     [ https://issues.apache.org/jira/browse/CASSANDRA-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Deng updated CASSANDRA-10306:
---------------------------------
    Labels: dtcs  (was: )

> Splitting SSTables in time, deleting and archiving SSTables
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-10306
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10306
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Antti Nissinen
>              Labels: dtcs
>
> This document is a continuation of [CASSANDRA-10195|https://issues.apache.org/jira/browse/CASSANDRA-10195] and describes the need to be able to split SSTables along time boundaries, as also discussed in [CASSANDRA-8361|https://issues.apache.org/jira/browse/CASSANDRA-8361]. The data model is explained briefly, followed by the practical issues of running Cassandra with time series data and the resulting need for splitting capabilities.
> Data model (snippet from [CASSANDRA-9644|https://issues.apache.org/jira/browse/CASSANDRA-9644]):
> The data is time series data. It is saved so that one row contains a certain time span of data for a given metric (20 days in this case). The row key contains the start time of the time span and the metric name. The column name gives the offset from the beginning of the time span. The column timestamp is set by adding the offset to the timestamp from the row key, i.e. it equals the actual timestamp of the data point. The data model is analogous to the KairosDB implementation.
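> As an illustration only (this sketch is not part of the ticket), the arithmetic of that layout can be written out roughly as follows; the function, the metric name and the epoch-aligned 20-day spans are assumptions made up for the example:
> {code}
> # Illustrative sketch of the KairosDB-style layout described above: one row
> # holds a fixed time span (here 20 days) of one metric, the column name is
> # the offset from the start of that span, and the column (write) timestamp
> # is set to the actual sample time.
> from datetime import datetime, timezone
>
> SPAN_MS = 20 * 24 * 3600 * 1000   # length of one row's time span, in milliseconds
>
> def locate(metric: str, sample_time_ms: int):
>     """Return (row_key, column_name, column_timestamp) for one data point."""
>     span_start = sample_time_ms - (sample_time_ms % SPAN_MS)  # start of the 20-day span
>     row_key = (metric, span_start)                 # metric name + span start time
>     column_name = sample_time_ms - span_start      # offset from the span start
>     column_timestamp = span_start + column_name    # equals the actual sample time
>     return row_key, column_name, column_timestamp
>
> # Example: a sample taken at 2016-01-05 12:00:00 UTC
> ts = int(datetime(2016, 1, 5, 12, tzinfo=timezone.utc).timestamp() * 1000)
> print(locate("pump.pressure", ts))
> {code}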
> In the practical application the data is added in real time to the column family. When converting from a legacy system, old data is pre-loaded in chronological order by faking the column timestamps before the real-time data collection is started. However, there is intermittently a need to insert older data as well, because it has not been available in real time, or because additional time series are fed in afterwards due to unforeseeable needs.
> Adding old data simultaneously with real-time data will lead to SSTables that contain data from a time period exceeding the length of the compaction window (TWCS and DTCS). Therefore the SSTables do not behave in a predictable manner in the compaction process.
> Tombstones mask the data from queries, but releasing the disk space requires that the SSTables containing the tombstones get compacted together with the SSTables holding the original data. When using TWCS or DTCS and writing tombstones with timestamps corresponding to the current time, the SSTables containing the original data will never end up being compacted with the SSTables holding the tombstones. Even when writing tombstones with faked timestamps, the resulting SSTable should be written apart from the ongoing real-time data; otherwise the SSTables have to be split (see below).
> TTL is a working method for deleting data from a column family and releasing disk space in a predictable manner. However, setting the correct TTL is not a trivial task. The required TTL might change, e.g. due to legislation, or the customer might want a longer lifetime for the data.
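> The following is a simplified, illustrative model of the window arithmetic and not the actual TWCS/DTCS code; the one-day window and the concrete timestamps are assumptions made for the example. It shows why a tombstone written with a current timestamp lands in a different time window than the old data it shadows, and the same arithmetic explains why a flush that mixes back-filled and real-time writes produces an SSTable spanning several windows:
> {code}
> # Time-windowed strategies group SSTables by the window of their write
> # timestamps. A tombstone issued "now" therefore ends up in the current
> # window, while the data it shadows sits in an old window, and the two are
> # never candidates for the same time-window compaction.
> DAY_MS = 24 * 3600 * 1000
> WINDOW_MS = 1 * DAY_MS          # assumed compaction window size: one day
>
> def window(write_ts_ms: int) -> int:
>     """Bucket a write timestamp into its compaction window (window start)."""
>     return write_ts_ms - (write_ts_ms % WINDOW_MS)
>
> data_write_ts = 1451952000000        # sample written at its real time, 2016-01-05
> tombstone_write_ts = 1452902400000   # deletion issued "now", 2016-01-16
>
> # Different windows -> the disk space behind the shadowed data is not
> # reclaimed until something forces the two SSTables together.
> print(window(data_write_ts) == window(tombstone_write_ts))   # False
> {code}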
> The other factor affecting disk space consumption is the variability of the rate at which data is fed into the column family. In certain troubleshooting cases the sample rate can be increased tenfold for a large portion of the collected time series. This leads to rapid consumption of disk space, and old data has to be deleted or archived in such a manner that the disk space is released quickly and predictably.
> Losing one or more nodes from the cluster without having spare hardware will also lead to a situation where the data from the lost node has to be replicated again onto the remaining nodes. This increases the disk space consumption per node and probably requires cleaning some older data away from the active column family.
> All of the above issues could of course be handled simply by adding more disk space or more nodes to the cluster. In a cloud environment that would be a feasible option. For an application running on real hardware in an isolated environment it is not, for practical reasons or because of costs; getting new hardware on site might take a long time, e.g. due to customs regulations.
> In the application domain (time series data collection) the data is not modified after it has been inserted into the column family. There are only read operations and deletion / archiving of old data, based on the TTL or on operator actions.
> The above reasoning leads to the following conclusions and proposals:
> * TWCS and DTCS (with certain modifications) lead to well structured SSTables, organized along time, which gives opportunities to manage the available disk capacity on the nodes. Recovering from repairs also works (compacting the flood of small SSTables with larger ones).
> * Being able to effectively split SSTables along a given timeline would lead to SSTable sets on all nodes that allow deleting or archiving SSTables. What would be the mechanism to inactivate SSTables during deletion / archiving so that nodes don’t start streaming the “missing” data between nodes (repairs)?
> * Being able to split existing SSTables along multiple timelines determined by TWCS would allow insertion of older data into the column family that would eventually be compacted in the desired manner into the correct time window. The original SSTable would be streamed into several SSTables according to the time windows, and in the end the empty SSTables would be discarded (a rough sketch of this split is given after this list).
> * The splitting action would be a tool executed through the nodetool command when needed.
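> As a rough, illustrative sketch only (operating on plain timestamped records instead of real SSTables, and not the proposed nodetool implementation), the intended split could behave as follows; the window size and the sample records are assumptions made for the example:
> {code}
> # Sketch of the proposed split semantics: the contents of one input
> # "SSTable" are redistributed into one output per time window, and windows
> # that receive nothing are simply never produced (no empty SSTables).
> from collections import defaultdict
>
> WINDOW_MS = 24 * 3600 * 1000   # assumed one-day compaction window
>
> def split_by_window(cells):
>     """Group the cells of one SSTable into per-window buckets (window start -> cells)."""
>     buckets = defaultdict(list)
>     for write_ts, cell in cells:
>         buckets[write_ts - (write_ts % WINDOW_MS)].append((write_ts, cell))
>     return dict(buckets)
>
> # One input table mixing back-filled and current data splits into two outputs:
> mixed = [(1451692800000, "a"), (1451695200000, "b"), (1452902400000, "c")]
> for window_start, part in sorted(split_by_window(mixed).items()):
>     print(window_start, len(part))   # the 2016-01-02 window and the 2016-01-16 window
> {code}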