incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Keith Wright <kwri...@nanigans.com>
Subject CQL3 Data Model Question
Date Tue, 07 May 2013 20:02:06 GMT
Hi all,

    I was hoping you could provide some assistance with a data modeling question (my apologies
if a similar question has already been posed).  I have time based data that I need to store
on a per customer (aka app id ) basis so that I can easily return it in sorted order by event
time.  The data in question is being written at high volume (~50K / sec) and I am concerned
about the cardinality of using either app id or event time as the row key as either will likely
result in hot spots.  Here are is the table definition I am considering:

create table organic_events (
event_id UUID,
app_id INT,
event_time TIMESTAMP,
user_id INT,
….
PRIMARY KEY (app_id, event_time, event_id)
)  WITH CLUSTERING ORDER BY (app_id asc,event_time desc);

So that I can be able to query as follows which will naturally sort the results by time descending:

select * from organic_events where app_id = 1234 and event_time <= '2012-01-01' and event_time
> '2012-01-01';

Anyone have an idea of the best way to accomplish this?  I was considering the following:

 *   Making the row key a concatenation of app id and 0-100 using a mod on event id to get
the value.  When getting data I would just fetch all keys given the mods (app_id in (1234_0,1234_1,1234_2,
etc).  This would alleviate the "hot" key issue but still seems expensive and a little hacky
 *   I tried removing app_id from the primary key all together (using primary key of user_id,
event_time, event_id) and making app_id a secondary index.  I would need to sort by time on
the client.  The above query is valid however running a query is VERY slow as I believe it
needs to fetch every row key that matches the index which is quite expensive (I get a timeout
in cqlsh).
 *   Create a different column family for each app id (I.e. 1234_organic_events).  Note that
we could easily have 1000s of application ids.

Thanks!

Mime
View raw message