cassandra-user mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Re: Time series data model and tombstones
Date Sun, 29 Jan 2017 19:38:52 GMT
OK, so give it a try with TWCS. Since STCS does not sort data by
timestamp, your wide partition may span multiple SSTables, and inside
each SSTable old data (plus tombstones) may sit in the same partition as
newer data.

When reading by slice, even if you request only fresh data, Cassandra has
to scan over a lot of tombstones to fetch the correct range of data, hence
your issue.
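
The time-bucket advice quoted further down in this thread can be sketched
roughly as follows. This is only an illustration, assuming the partition
key were changed to (id, bucket) with one bucket per UTC day. The names
bucket_of and partition_keys and the one-day bucket size are hypothetical
and are not taken from the schema discussed here.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

# Assumed bucket size: one UTC day. Pick a size that suits the write rate and TTL.
BUCKET = timedelta(days=1)


def bucket_of(ts: datetime) -> str:
    """Return the UTC day bucket (YYYY-MM-DD) for a timezone-aware timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def partition_keys(metric_id: str, start: datetime, end: datetime) -> List[Tuple[str, str]]:
    """List the (id, bucket) partitions that a read over [start, end] has to touch."""
    if end < start:
        raise ValueError("end must not be before start")
    # Floor the start to midnight UTC, then step one bucket at a time.
    day = start.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    last = end.astimezone(timezone.utc)
    keys = []
    while day <= last:
        keys.append((metric_id, bucket_of(day)))
        day += BUCKET
    return keys


if __name__ == "__main__":
    now = datetime(2017, 1, 29, 19, 38, 52, tzinfo=timezone.utc)
    # A "past hour" query only needs the current day's partition.
    print(partition_keys("some-id", now - timedelta(hours=1), now))
```

With a layout like this, a "past hour" read touches a single small
partition, and a "past week" read touches eight of them, so data older
than the 7-day TTL lives in partitions that recent queries never open.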

On Sun, Jan 29, 2017 at 8:19 PM, John Sanda <john.sanda@gmail.com> wrote:

> It was with STCS. It was on a 2.x version before TWCS was available.
>
> On Sun, Jan 29, 2017 at 10:58 AM DuyHai Doan <doanduyhai@gmail.com> wrote:
>
>> Did you get this overwhelming tombstone behavior with STCS or with TWCS?
>>
>> If you're using DTCS, beware of its weird behavior and tricky
>> configuration.
>>
>> On Sun, Jan 29, 2017 at 3:52 PM, John Sanda <john.sanda@gmail.com> wrote:
>>
>> As I mentioned previously, the UI only queries recent data, e.g., the
>> past hour, past two hours, past day, past week. The UI does not query for
>> anything older than the TTL, which is 7 days. My understanding and
>> expectation was that Cassandra would only scan live cells. The UI is a
>> separate application that I do not maintain, so I am not 100% certain about
>> the queries. I have been told that it does not query for anything older
>> than 7 days.
>>
>> On Sun, Jan 29, 2017 at 4:14 AM, kurt greaves <kurt@instaclustr.com>
>> wrote:
>>
>>
>> Your partitioning key is text. If you have multiple entries per id, you
>> are likely hitting older cells that have expired. Descending only affects
>> how the data is stored on disk. If you have to read the whole partition to
>> find whichever time you are querying for, you could potentially hit
>> tombstones in other SSTables that contain the same "id". As mentioned
>> previously, you need to add a time bucket to your partitioning key and
>> definitely use DTCS/TWCS.
>>
>> --
>>
>> - John
