incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Counters and Top 10
Date Fri, 23 Dec 2011 09:52:28 GMT
Counters only update the value of a column; they cannot be used as column names. Columns are
sorted by name, not by value, so you cannot have a dynamically updating top ten list using counters.

You have a couple of options. The first is to use something like Redis, if that fits your use case.
Redis could either be the database of record for the counts, or just an aggregation layer: write
the data to both Cassandra and sorted sets in Redis, then read the top ten from Redis and use
Cassandra to rebuild Redis if needed.
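
A minimal sketch of the sorted-set idea, assuming the redis-py client, whose zincrby(name, amount, value) and zrevrange(name, start, end, withscores=...) calls are used here. The key name "user_credits", the helper functions and the InMemorySortedSets stand-in are all hypothetical and only there so the example runs without a server:

```python
LEADERBOARD_KEY = "user_credits"  # hypothetical key name


def add_credits(client, user_id, credits):
    """Add credits to a user's score; ZINCRBY creates the member if missing."""
    return client.zincrby(LEADERBOARD_KEY, credits, user_id)


def top_ten(client):
    """Return up to ten (user_id, score) pairs, highest score first."""
    return client.zrevrange(LEADERBOARD_KEY, 0, 9, withscores=True)


class InMemorySortedSets:
    """Illustrative stand-in for a Redis client (not part of redis-py).

    It mimics only the two calls used above, for non-negative indexes.
    """

    def __init__(self):
        self._sets = {}

    def zincrby(self, name, amount, value):
        scores = self._sets.setdefault(name, {})
        scores[value] = scores.get(value, 0.0) + amount
        return scores[value]

    def zrevrange(self, name, start, end, withscores=False):
        scores = self._sets.get(name, {})
        # Highest score first; ties fall back to reverse member order.
        ordered = sorted(scores.items(),
                         key=lambda kv: (kv[1], kv[0]), reverse=True)
        window = ordered[start:end + 1]  # the end index is inclusive
        return window if withscores else [member for member, _ in window]


client = InMemorySortedSets()
add_credits(client, "carlo", 5)   # e.g. credits for writing a post
add_credits(client, "aaron", 2)   # e.g. credits for a login
add_credits(client, "carlo", 1)
leaders = top_ten(client)
```

With a real server you would pass a redis-py client instead of the stand-in; the same writes would also go to Cassandra so the sorted set can be rebuilt from there.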

The other is to periodically pivot the counts into a top ten row where you use regular integers
for the column names. With only 10K users you could do this with a process that periodically
reads all the user rows, or wherever the counters are, and updates the aggregate row. Depending
on data size you could use Hive/Pig or whatever regular programming language you are happy
with.
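
A rough sketch of the pivot step in plain Python, assuming the counter values have already been read into a dict of user id to count. The function name and the sample data are made up for illustration. It pairs each count with the user id, because two users with the same count would otherwise collide on the same column name:

```python
import heapq


def build_top_ten_columns(counts, n=10):
    """Return up to n (count, user_id) pairs, highest count first.

    Each pair is meant to become one column name in the aggregate row.
    """
    return heapq.nlargest(n, ((count, user) for user, count in counts.items()))


# Hypothetical counter values read from the user rows.
counts = {"alice": 5, "bob": 12, "carol": 5, "dave": 1}
columns = build_top_ten_columns(counts, n=3)
```

The periodic job would then overwrite the aggregate row with these columns, and readers would slice that row instead of touching the counters.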

I guess you could also use Redis to keep the top ten sorted, then periodically dump that
back to Cassandra and serve the read traffic from there.

Hope that helps 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/12/2011, at 3:46 AM, R. Verlangen wrote:

> I would suggest you to create a CF with a single row (or multiple for historical data)
> with a date as key (utf8, e.g. 2011-12-22) and multiple columns for every user's score. The
> column (utf8) would then be the score + something unique of the user (e.g. hex representation
> of the TimeUUID). The value would be the TimeUUID of the user.
> 
> By default columns will be sorted and you can perform a slice to get the top 10.
> 
> 2011/12/14 cbertu81@libero.it <cbertu81@libero.it>
> Hi all,
> I'm using Cassandra in production for a small social network (~10.000 people).
> Now I have to assign some "credits" to each user operation (login, write post
> and so on) and then be able to provide, at any moment, the top 10 of
> the most active users. I'm on Cassandra 0.7.6 and I'd like to migrate to a new
> version in order to use Counters for the user points but ... what about the top
> 10?
> I was thinking about a specific ROW that always keeps the 10 most active users
> ... but I think it would be heavy (to write and to handle in thread-safe mode)
> ... can counters provide something like a "value ordered list"?
> 
> Thanks for any help.
> Best regards,
> 
> Carlo
> 
> 
> 

