cassandra-user mailing list archives

From "Peer, Oded" <>
Subject RE: Interesting use case
Date Thu, 09 Jun 2016 05:14:38 GMT
Why do you think the number of partitions is different in these tables? The partition key is
the same (system_name and event_name); it is the number of rows per partition that differs.

From: Kurt Greaves []
Sent: Thursday, June 09, 2016 7:52 AM
Subject: Re: Interesting use case

I would say it's probably due to a significantly larger number of partitions when using the
overwrite method, but really you should be seeing similar performance unless one of the schemas
ends up generating a lot more disk IO.
If you're planning to read the last N values for an event at the same time, the widerow schema
would be better; otherwise, reading N events using the overwrite schema will result in you
hitting N partitions. You really need to take into account how you're going to read the data
when you design a schema, not only how many writes you can push through.
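For example (just a sketch against the schemas quoted below, with placeholder keys and LIMIT 10
standing in for N), the last N values for one event is a single slice of one partition in the
widerow table, while the overwrite table only ever holds the latest value per event:

-- widerow: last 10 values for one event, read as one partition slice
SELECT event_time, event_value
FROM eventvalue_widerow
WHERE system_name = 'sys1' AND event_name = 'evt1'
ORDER BY event_time DESC
LIMIT 10;

-- overwrite: only the most recent value for this event is stored
SELECT event_time, event_value
FROM eventvalue_overwrite
WHERE system_name = 'sys1' AND event_name = 'evt1';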

On 8 June 2016 at 19:02, John Thomas <> wrote:
We have a use case where we are storing event data for a given system and only want to retain
the last N values.  Storing extra values for some time, as long as it isn’t too long, is
fine, but we can never keep fewer than N.  We can't use TTLs to delete the data because we can't
be sure how frequently events will arrive and could end up losing everything.  Is there any
built-in mechanism to accomplish this, or a known pattern that we can follow?  The events will be
read and written at a pretty high frequency, so the solution would have to be performant and
not fragile under stress.

We’ve played with a schema that just has N distinct columns with one value in each, but have
found that overwrites seem to perform much worse than wide rows.  The use case we tested only
required storing the most recent value:

CREATE TABLE eventvalue_overwrite (
    system_name text,
    event_name text,
    event_time timestamp,
    event_value blob,
    PRIMARY KEY (system_name, event_name));

CREATE TABLE eventvalue_widerow (
    system_name text,
    event_name text,
    event_time timestamp,
    event_value blob,
    PRIMARY KEY ((system_name, event_name), event_time));
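
The write pattern we compared looks roughly like the following (key, timestamp, and blob values
are just placeholders):

-- overwrite: repeated writes for the same system/event replace the stored value
INSERT INTO eventvalue_overwrite (system_name, event_name, event_time, event_value)
VALUES ('sys1', 'evt1', '2016-06-08 19:02:00+0000', 0xCAFE);

-- widerow: every distinct event_time adds another row to the same partition
INSERT INTO eventvalue_widerow (system_name, event_name, event_time, event_value)
VALUES ('sys1', 'evt1', '2016-06-08 19:02:00+0000', 0xCAFE);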

We tested it against the DataStax AMI on EC2 with 6 nodes, replication factor 3, write consistency
2, and default settings, with a write-only workload, and got 190K/s for wide row and 150K/s
for overwrite.  Thinking through the write path it seems the performance should be pretty
similar, with probably smaller sstables for the overwrite schema.  Can anyone explain the big
difference?

The wide row solution is more complex in that it requires a separate cleanup thread that
will handle deleting the extra values.  If that’s the path we have to follow, we’re thinking
we’d add a bucket of some sort so that we can delete an entire partition at a time after
copying some values forward, on the assumption that deleting the whole partition is much better
than deleting some slice of the partition.  Is that true?  Also, is there any difference between
setting a really short TTL and doing a delete?
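
A rough sketch of the bucketed variant we have in mind (the table name, the time_bucket column,
and the daily granularity are just placeholders) would key each partition by a bucket and drop
old buckets with a single partition-level delete once their values have been copied forward:

CREATE TABLE eventvalue_bucketed (
    system_name text,
    event_name text,
    time_bucket text,
    event_time timestamp,
    event_value blob,
    PRIMARY KEY ((system_name, event_name, time_bucket), event_time));

-- after copying the values we still need into the newest bucket,
-- the old bucket goes away with one partition delete
DELETE FROM eventvalue_bucketed
WHERE system_name = 'sys1' AND event_name = 'evt1' AND time_bucket = '2016-06-08';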

I know there are a lot of questions in there but we’ve been going back and forth on this
for a while and I’d really appreciate any help you could give.


Kurt Greaves