cassandra-user mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Challenge with initial data load with TWCS
Date Sat, 28 Sep 2019 20:31:07 GMT
Hello users

TWCS works great in steady state. It creates SSTables of roughly
fixed size if your insertion rate is fairly constant.

The difficulty is the initial load.

Let's say we configure TWCS with window unit = day and window size =
1. We would get 1 SSTable per day, and with TTL = 365 days all data
would expire after 1 year.
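
For concreteness, the configuration described above could be written roughly as follows. This is only a sketch: the keyspace and table names (metrics, events) are hypothetical, and the TTL is assumed to be set as the table's default_time_to_live rather than per write.

```python
# Sketch: build the CQL for a 1-day TWCS window and a 365-day default TTL.
# "metrics" and "events" are hypothetical names, not taken from the post.
TTL_DAYS = 365
ttl_seconds = TTL_DAYS * 24 * 60 * 60  # default_time_to_live is in seconds


def twcs_alter_statement(keyspace: str, table: str, ttl: int) -> str:
    """Return an ALTER TABLE statement enabling TWCS with 1-day windows."""
    return (
        f"ALTER TABLE {keyspace}.{table} "
        "WITH compaction = {"
        "'class': 'TimeWindowCompactionStrategy', "
        "'compaction_window_unit': 'DAYS', "
        "'compaction_window_size': '1'} "
        f"AND default_time_to_live = {ttl};"
    )


statement = twcs_alter_statement("metrics", "events", ttl_seconds)
print(statement)
```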

Now, since the cluster is still empty, we need to load 1 year's worth
of data. If we use TWCS and the load takes 7 days, we end up with 7
SSTables, each aggregating about 365/7 (roughly 52 days) of the annual
data. Ideally, we would like TWCS to split this data into 365 distinct
SSTables.
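
To make the arithmetic concrete, here is a small simulation of the scenario. It is a simplified model, not the actual TWCS code: it buckets each record individually by day, whereas TWCS assigns whole SSTables to windows based on their write timestamps. The dates and the one-record-per-day layout are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

DAY = timedelta(days=1)


def window_start(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its 1-day window."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def count_windows(timestamps) -> int:
    """Count the distinct 1-day windows covered by the timestamps."""
    return len({window_start(ts) for ts in timestamps})


# Assumed dates: the load starts on 2019-09-28 and covers the previous year.
load_start = datetime(2019, 9, 28, tzinfo=timezone.utc)
source_start = load_start - 365 * DAY

# One record per source day; the load is spread evenly over 7 days.
records = [
    (source_start + i * DAY, load_start + (i * 7 // 365) * DAY)
    for i in range(365)
]

by_insertion = count_windows(inserted for _, inserted in records)
by_source = count_windows(source for source, _ in records)

print(by_insertion, by_source)  # prints "7 365"
```

Bucketing by insertion time gives 7 windows, while bucketing by source timestamp gives the 365 windows we would like to end up with.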

So my question is: how should this scenario be handled? How can we
perform an initial load into a table using TWCS so that compaction
splits the data cleanly based on the source data timestamp rather than
the insertion timestamp?

Regards

Duy Hai DOAN
