cassandra-user mailing list archives

From Sébastien Pierre <>
Subject Re: Cassandra to store logs as a list
Date Wed, 20 Jan 2010 21:31:07 GMT
Hi Mark,

The most common query would be basically "get all the logs for this
particular day (and campaign)" or "get all the logs since this particular
time stamp (and campaign)", where everything would be aggregated by
"campaign id" (it's for an ad server).

In this case, would a key like the following improve balancing:
"campaign:<HEX_PADDED_CAMPAIGN_ID>:<NANOTIMESTAMP>"? Also, if I add a
prefix (like "campaign:<HEX_PADDED_CAMPAIGN_ID>:"), would the key have to
be UTF8Type instead of TimeUUIDType?
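
To make the proposed key layout concrete, here is a minimal sketch in plain Python, with no Cassandra client involved. The field widths (8 hex digits for the campaign id, 20 decimal digits for the nanosecond timestamp) and the helper names make_log_key and day_key_range are assumptions for illustration only. Zero-padding the timestamp is also an assumption: it is what makes string order match numeric order within one campaign, which an order-preserving range scan would rely on.

```python
from datetime import date, datetime, timedelta, timezone

# Assumed fixed widths; pick values large enough for your real id and time ranges.
CAMPAIGN_ID_WIDTH = 8    # hex digits
TIMESTAMP_WIDTH = 20     # decimal digits, enough for nanoseconds since the epoch


def make_log_key(campaign_id: int, timestamp_ns: int) -> str:
    """Build "campaign:<HEX_PADDED_CAMPAIGN_ID>:<NANOTIMESTAMP>" as a string.

    Both numeric fields are zero-padded to a fixed width so that comparing
    keys as strings gives the same order as comparing (campaign_id, timestamp).
    """
    return (
        f"campaign:{campaign_id:0{CAMPAIGN_ID_WIDTH}x}"
        f":{timestamp_ns:0{TIMESTAMP_WIDTH}d}"
    )


def day_key_range(campaign_id: int, day: date) -> tuple[str, str]:
    """Return (start_key, end_key) covering one UTC day for one campaign.

    start_key is inclusive; end_key is the first key of the following day.
    """
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    start_ns = int(start.timestamp()) * 10**9
    end_ns = int(end.timestamp()) * 10**9
    return make_log_key(campaign_id, start_ns), make_log_key(campaign_id, end_ns)
```

With keys shaped like this, "all the logs for this day (and campaign)" becomes a scan between the two keys returned by day_key_range, and "all the logs since this time stamp" becomes a scan starting at make_log_key(campaign_id, since_ns). Whether the column family must then use UTF8Type is a separate question that this sketch does not answer.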

 -- Sébastien

2010/1/20 Mark Robson <>

> I think you really want to be using the OrderPreservingPartitioner and
> using time-based keys.
> It depends exactly how you're querying it. All querying use-cases need to
> be taken into account when deciding how to structure your data.
> If you use a time-based key with OPP, typically data become very
> unbalanced, because the balancing algorithm (such as exists) depends on the
> keys continuing to have a similar distribution as when the nodes were
> kickstarted.
> One solution would be to put some other field on the beginning of the key
> that you might wish to use such as account id, customer id, site id, etc, if
> you have sufficient of these to spread the data out evenly (do it in hex and
> zero pad it, of course)
> Mark
