cassandra-user mailing list archives

From mck <...@apache.org>
Subject Re: best practices for time-series data with massive amounts of records
Date Tue, 03 Mar 2015 13:40:04 GMT
Clint,

> CREATE TABLE events (
>   id text,
>   date text, // Could also use year+month here or year+week or something else
>   event_time timestamp,
>   event blob,
>   PRIMARY KEY ((id, date), event_time))
> WITH CLUSTERING ORDER BY (event_time DESC);
> 
> The downside of this approach is that we can no longer do a simple
> continuous scan to get all of the events for a given user.  Some users
> may log lots and lots of interactions every day, while others may interact
> with our application infrequently, so I'd like a quick way to get the most
> recent interaction for a given user.
> 
> Has anyone used different approaches for this problem?


One idea is to provide additional manual partitioning like…

CREATE TABLE events (
  user_id text,
  partition int,
  event_time timeuuid,
  event_json text,
  PRIMARY KEY ((user_id, partition), event_time)
) WITH
  CLUSTERING ORDER BY (event_time DESC) AND
  compaction={'class': 'DateTieredCompactionStrategy'};


Here "partition" is a random integer from 0 to (N*M)-1,
where N=nodes in cluster, and M=arbitrary number.
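
A minimal sketch of the bucketing in Python, assuming the writer picks the
bucket uniformly at random and the reader enumerates every bucket. The
function names (num_buckets, pick_bucket, partition_keys_for) are
hypothetical and not part of any Cassandra driver.

```python
import random
from typing import List, Tuple


def num_buckets(nodes: int, m: int) -> int:
    """Number of manual partitions per user: N * M."""
    return nodes * m


def pick_bucket(nodes: int, m: int) -> int:
    """Write side: choose a random bucket in the range 0 .. N*M - 1."""
    return random.randrange(num_buckets(nodes, m))


def partition_keys_for(user_id: str, nodes: int, m: int) -> List[Tuple[str, int]]:
    """Read side: every (user_id, partition) pair that may hold the user's events."""
    return [(user_id, bucket) for bucket in range(num_buckets(nodes, m))]
```

With N=3 and M=2, for example, each write lands in one of six partitions
for that user, and a read has to cover all six keys.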

Read performance is going to suffer a little because each read has to
query N*M partition keys instead of one, but the overhead should be
predictable enough that it comes down to increasing the cluster's
hardware and scaling out as need be.

You can do the multi-key reads with a SELECT…IN query, or better yet
with parallel reads (less pressure on the coordinator at the expense of
extra network calls).
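
A rough sketch of the parallel-read variant in Python, which also covers
the "most recent interaction for a given user" case from the original
question. It assumes a hypothetical callable fetch_bucket(user_id, bucket,
limit) that stands in for one single-partition SELECT … LIMIT and returns
(event_time, event) tuples newest first, as the clustering order gives
you. The event_time values only need to be comparable here; this is not
driver code.

```python
import heapq
import itertools
from concurrent.futures import ThreadPoolExecutor


def read_latest(user_id, buckets, fetch_bucket, limit):
    """Query every bucket in parallel and merge the newest-first results.

    fetch_bucket is a hypothetical stand-in for a single-partition query;
    each call must return rows sorted by event_time descending.
    """
    with ThreadPoolExecutor(max_workers=max(1, buckets)) as pool:
        per_bucket = list(
            pool.map(lambda b: fetch_bucket(user_id, b, limit), range(buckets))
        )
    # Each input is already sorted newest first, so a k-way merge is enough.
    merged = heapq.merge(*per_bucket, key=lambda row: row[0], reverse=True)
    return list(itertools.islice(merged, limit))
```

Passing the same limit to every bucket is what keeps the result correct:
the overall newest `limit` rows cannot include more than `limit` rows from
any single bucket.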

Starting with M=1, you have the option to increase it over time if the
number of rows per partition for any user gets too high.
(We do¹ something similar for storing all raw events in our enterprise
platform, but because the data is not user-centric the initial partition
key is minute-by-minute timebuckets, and M has remained at 1 the whole
time).
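
On raising M later: a small Python sketch of why that should not need a
data migration, assuming readers always query the full range 0 .. N*M - 1
for the current N and M. The helper name buckets_to_read is hypothetical.

```python
def buckets_to_read(nodes: int, m: int) -> range:
    """All bucket ids a reader must query for the current N and M."""
    return range(nodes * m)


old = set(buckets_to_read(nodes=3, m=1))  # buckets written while M=1
new = set(buckets_to_read(nodes=3, m=2))  # buckets queried after raising M to 2
assert old <= new  # rows written under the old M are still covered by reads
```

The range only grows, so existing rows stay reachable; new writes simply
spread over more partitions from then on.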

This approach is better than using an order-preserving partitioner
(really don't do that).

I would also consider replacing "event blob" with "event text", choosing
json instead of any binary serialisation. We've learnt the hard way the
value of data transparency, and I'm guessing the extra storage cost is
small given C* compression.

Otherwise the advice here is largely repeating what Jens has already
said.

~mck

  ¹ slide 19+20 from
  https://prezi.com/vt98oob9fvo4/cassandra-summit-cassandra-and-hadoop-at-finnno/
