Sorry, I made a mistake in topics-seen!
When you insert it should be :


Sorry about that,

2011/5/18 openvictor Open <>
I guess you can use the same system. You need two CFs for that, and I think it's better to use 0.8 because it supports counters:

One CF with UTF8Type called active-topics and one CF with UUIDType called topics-seen. Then, using the same principle:

for each timestampN you create :

For each visit to Topic1 Topic2 Topic1

You create a TimeUUID and you insert
active-topics[topics:timestampN] = {Topic1:whateveryouwant}
and :

active-topics[topics:timestampN] = {Topic2:whateveryouwant}
and :

active-topics[topics:timestampN] = {Topic1:whateveryouwant}
and :

Then, when you want to query, you first fetch all the topics (a slice) from active-topics for the row topics:timestampN, and then you get the counts from the topics-seen CF for each of those topics.
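
A rough in-memory sketch of this two-CF idea, in Python rather than against a real Cassandra client. The dictionaries only stand in for the column families. The row-key layout used for topics-seen ("Topic:timestampN"), the 15-minute period and the helper names (period_start, record_visit, most_viewed) are assumptions made for illustration, not something stated in this thread.

```python
import uuid
from collections import defaultdict

PERIOD_SECONDS = 15 * 60  # assumption: 15-minute periods

# In-memory stand-ins for the two column families (not a Cassandra client).
active_topics = defaultdict(dict)  # row "topics:<timestampN>" -> {topic: value}
topics_seen = defaultdict(dict)    # row "<topic>:<timestampN>" -> {TimeUUID: value}


def period_start(ts):
    """Truncate a Unix timestamp to the start of its period (timestampN)."""
    return ts - ts % PERIOD_SECONDS


def record_visit(topic, ts):
    """Record one visit: mark the topic active and add one TimeUUID column."""
    n = period_start(ts)
    active_topics["topics:%d" % n][topic] = ""           # "whateveryouwant"
    topics_seen["%s:%d" % (topic, n)][uuid.uuid1()] = ""  # one column per visit


def most_viewed(ts, limit=10):
    """Slice the active topics of the period, then count columns per topic."""
    n = period_start(ts)
    topics = active_topics.get("topics:%d" % n, {})
    counts = [(t, len(topics_seen.get("%s:%d" % (t, n), {}))) for t in topics]
    counts.sort(key=lambda pair: (-pair[1], pair[0]))  # highest count first
    return counts[:limit]


record_visit("Topic1", 1000)
record_visit("Topic2", 1010)
record_visit("Topic1", 1020)
print(most_viewed(1000))
```

In a real cluster the second step would be one count request per active topic, which is where the extra overhead compared to plain counters comes from.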

Not so simple... It does add overhead compared to a simple counter solution, but I think it is far more elegant. This is just my opinion, though.


2011/5/18 Aditya Narayan <>
Thanks victor!

Aren't there any good ways to do this using Cassandra alone?

On Wed, May 18, 2011 at 11:41 PM, openvictor Open <> wrote:
Have you thought about using another kind of database, one which supports volatile content for example?

I am currently thinking about doing something similar. The best and simplest option that I can think of at the moment is Redis. In Redis you have the option of querying keys with wildcards. Your problem can be solved by just inserting a UUID into Redis for a certain amount of time (the best is to tailor this amount of time as an inverse function of the number of keys existing in Redis).

With Redis, what I would do: I cut time into pieces of X minutes (15 minutes, for example) by truncating a timestamp. Let timestampN be the timestamp for the period of time [N, N+15], and let Topic1 and Topic2 be two topics. Then:

One or more people will view Topic 1 then Topic2 then again Topic1 in this period of 15 minutes
(HINCRBY is the Increment)
HINCRBY topics:Topic1:timestampN viewcount 1
HINCRBY topics:Topic2:timestampN viewcount 1
HINCRBY topics:Topic1:timestampN viewcount 1

Then you just query in the following way :

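KEYS topics:*:timestampN

* is the wildcard (pattern matching is done by KEYS; MGET takes exact key names and only reads string values). For each key returned you read the count with HGET <key> viewcount, you order by viewcount and you have what you are asking for!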
This is a simplified version of what you should do, but personally I really like the combination of Cassandra and Redis.
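
A small Python sketch of the time-bucket counting described above. To keep it runnable without a server it uses a tiny in-memory stand-in (FakeRedis, a made-up class) that mimics only three commands: HINCRBY, KEYS and HGET. With a real client such as redis-py the same three method names exist, but I have not tested this against one, and you would probably want decode_responses=True so that keys come back as strings. The helper names and the 15-minute period are assumptions.

```python
import fnmatch
from collections import defaultdict

PERIOD_SECONDS = 15 * 60  # assumption: X = 15 minutes


class FakeRedis:
    """In-memory stand-in exposing only the three commands used below."""

    def __init__(self):
        self._hashes = defaultdict(dict)

    def hincrby(self, name, field, amount=1):
        value = self._hashes[name].get(field, 0) + amount
        self._hashes[name][field] = value
        return value

    def keys(self, pattern):
        return [k for k in self._hashes if fnmatch.fnmatchcase(k, pattern)]

    def hget(self, name, field):
        return self._hashes.get(name, {}).get(field)


def period_start(ts):
    """Truncate a Unix timestamp to the start of its period (timestampN)."""
    return ts - ts % PERIOD_SECONDS


def record_view(r, topic, ts):
    """HINCRBY topics:<topic>:<timestampN> viewcount 1"""
    r.hincrby("topics:%s:%d" % (topic, period_start(ts)), "viewcount", 1)


def most_viewed(r, ts):
    """KEYS topics:*:<timestampN>, then HGET each count and sort client-side."""
    suffix = ":%d" % period_start(ts)
    result = []
    for key in r.keys("topics:*" + suffix):
        topic = key[len("topics:"):-len(suffix)]
        result.append((topic, int(r.hget(key, "viewcount"))))
    result.sort(key=lambda pair: (-pair[1], pair[0]))  # highest count first
    return result


r = FakeRedis()
for topic, ts in [("Topic1", 1000), ("Topic2", 1010), ("Topic1", 1020)]:
    record_view(r, topic, ts)
print(most_viewed(r, 1000))
```

Keep in mind that KEYS walks the whole keyspace, so on a large instance you may prefer to keep the topic names of each period in a separate set and read the counts from there.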


2011/5/18 Aditya Narayan <>
I would set the memtable flush period so that it equals the time period over which these most viewed discussions are generated, so that the entire row of most viewed discussions on a topic sits in one or at most two memtables/SSTables.
This would also help minimize the number of versions of the same column that end up spread across different SSTables.

On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan <> wrote:
For a discussion forum, I need to show a page of the most viewed discussions.

To implement this, I maintain a count of views for each discussion, and when the view count of a discussion passes a certain threshold, the discussion Id is added to a row of most viewed discussions.

This row of most viewed discussions contains columns with integer names, and each column value is a serialized list of the Ids of all discussions whose view count equals the integer name of that column.

Thus, if the view count of a discussion increases, I'll need to move its Id from the serialized list in one column to the serialized list in another column, whose name represents the updated view count of that discussion.

Thus I can get the most viewed discussions by reading the appropriate number of columns from one end of this integer-sorted row.
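
To make the design above concrete, here is a small in-memory Python sketch of it. A dictionary stands in for the row of most viewed discussions (column name = view count, column value = set of discussion Ids). The threshold value, the "at or above the threshold" rule and the function names are assumptions for illustration only; no Cassandra client is involved.

```python
from collections import defaultdict

THRESHOLD = 3  # hypothetical threshold for entering the most viewed row

view_counts = defaultdict(int)      # discussion id -> number of views
most_viewed_row = defaultdict(set)  # column name (view count) -> discussion ids


def record_view(discussion_id):
    """Increment the view count and move the id to the column for its new count."""
    old = view_counts[discussion_id]
    new = old + 1
    view_counts[discussion_id] = new
    if old >= THRESHOLD:
        # The id already sits in column 'old': remove it from there.
        ids = most_viewed_row[old]
        ids.discard(discussion_id)
        if not ids:
            del most_viewed_row[old]  # drop the column once it is empty
    if new >= THRESHOLD:
        most_viewed_row[new].add(discussion_id)


def top_discussions(n_columns):
    """Read n_columns from the high end of the integer-sorted row."""
    top = sorted(most_viewed_row, reverse=True)[:n_columns]
    return [(count, sorted(most_viewed_row[count])) for count in top]


for discussion_id, views in [("a", 5), ("b", 3), ("c", 2), ("d", 3)]:
    for _ in range(views):
        record_view(discussion_id)
print(top_discussions(2))
```

Note that every view of a popular discussion rewrites two columns (a read-modify-write of two serialized lists), which is the part of this design that is likely to cost the most.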


I wanted to get feedback from you all, to know if this is a good design.