cassandra-user mailing list archives

From Aditya Narayan <ady...@gmail.com>
Subject Re: Design for 'Most viewed Discussions' in a forum
Date Wed, 18 May 2011 18:42:29 GMT
Thanks, Victor!

Aren't there any good ways to do this using Cassandra alone?

On Wed, May 18, 2011 at 11:41 PM, openvictor Open <openvictor@gmail.com> wrote:

> Have you thought about using another kind of database, one that supports
> volatile content, for example?
>
> I am currently thinking about doing something similar. The best and
> simplest option that I can think of at the moment is Redis. In Redis you
> have the option of querying keys with wildcards. Your problem can be solved
> by just inserting a UUID into Redis for a certain amount of time (the best
> is to tailor this amount of time as an inverse function of the number of
> keys existing in Redis).
>
> *With Redis*
> What I would do: I cut time into pieces of X minutes (15 minutes, for
> example) by truncating a timestamp. Let timestampN be the timestamp for the
> period of time [N, N+15], and let Topic1 and Topic2 be two topics. Then:
>
> One or more people view Topic1, then Topic2, then Topic1 again in this
> period of 15 minutes
> (HINCRBY is the increment command)
> HINCRBY topics:Topic1:timestampN viewcount 1
> HINCRBY topics:Topic2:timestampN viewcount 1
> HINCRBY topics:Topic1:timestampN viewcount 1
> (HINCRBY: <http://redis.io/commands/hincrby>)
>
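> Then you query in the following way:
>
> KEYS topics:*:timestampN
>
> * is the wildcard. MGET <http://redis.io/commands/mget> itself takes
> literal key names rather than patterns, and the counters here are hash
> fields, so list the matching keys first, read viewcount from each with
> HGET, order by viewcount, and you have what you are asking for!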
> This is a simplified version of what you should do, but personally I really
> like the combination of Cassandra and Redis.
>
>
> Victor
>
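The Redis scheme Victor describes above (truncate the timestamp to a 15-minute bucket, HINCRBY a per-topic counter, then list the bucket's keys and sort by viewcount) might be sketched as follows. This is only an illustration: FakeRedis is a hypothetical in-memory stand-in so the example runs without a server, and bucket_start, record_view and most_viewed are made-up helper names, not part of Redis or of anything in this thread.

```python
import fnmatch
from collections import defaultdict

BUCKET_SECONDS = 15 * 60  # the 15-minute period used in the example above


def bucket_start(ts, size=BUCKET_SECONDS):
    """Truncate a Unix timestamp to the start of its time bucket."""
    return ts - ts % size


class FakeRedis:
    """Hypothetical in-memory stand-in for the three commands used here."""

    def __init__(self):
        self._hashes = defaultdict(dict)

    def hincrby(self, key, field, amount=1):
        # Like HINCRBY: a missing key or field starts from 0.
        value = self._hashes[key].get(field, 0) + amount
        self._hashes[key][field] = value
        return value

    def keys(self, pattern):
        # Like KEYS: glob-style match over every key in the store.
        return [k for k in self._hashes if fnmatch.fnmatchcase(k, pattern)]

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)


def record_view(r, topic, ts):
    """Count one view of `topic` in the bucket containing `ts`."""
    r.hincrby(f"topics:{topic}:{bucket_start(ts)}", "viewcount", 1)


def most_viewed(r, ts, limit=10):
    """Return (topic, viewcount) pairs for the bucket of `ts`, busiest first."""
    suffix = f":{bucket_start(ts)}"
    counts = []
    for key in r.keys(f"topics:*{suffix}"):
        topic = key[len("topics:"):-len(suffix)]
        counts.append((topic, int(r.hget(key, "viewcount"))))
    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:limit]


r = FakeRedis()
for topic in ("Topic1", "Topic2", "Topic1"):
    record_view(r, topic, 1305744000)
print(most_viewed(r, 1305744000))  # [('Topic1', 2), ('Topic2', 1)]
```

Against a real server, a Redis client object would take the place of FakeRedis. Note that KEYS walks the whole keyspace, so on a large instance an incremental SCAN or an explicit per-bucket index of topic keys would probably be the safer choice.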
> 2011/5/18 Aditya Narayan <adynnn@gmail.com>
>
>> I would set the memtable flush period so that it equals the time period
>> over which these most viewed discussions are generated, so that the
>> entire row of most viewed discussions on a topic is in one or at most two
>> memtables/SSTables.
>> This would also help minimize the number of versions of the same column
>> that end up in the parts of the row spread across different SSTables.
>>
>>
>>
>> On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan <adynnn@gmail.com> wrote:
>>
>>> *************
>>> For a discussions forum, I need to show a page of most viewed
>>> discussions.
>>>
>>> To implement this, I maintain a count of views for each discussion, and
>>> when the view count of a discussion passes a certain threshold, the
>>> discussion ID is added to a row of most viewed discussions.
>>>
>>> This row of most viewed discussions contains columns with integer names
>>> and values containing serialized lists of the IDs of all discussions
>>> whose view count equals the integer name of that column.
>>>
>>> Thus if the view count of a discussion increases, I'll need to move its
>>> ID from the serialized list in one column to the serialized list in
>>> another column whose name represents the updated view count of that
>>> discussion.
>>>
>>> Thus I can get the most viewed discussions by reading the appropriate
>>> number of columns from one end of this integer-sorted row.
>>>
>>> ************
>>>
>>> I wanted to get feedback from you all, to know if this is a good design.
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
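
The row layout proposed in the original post (integer column names, each holding the list of discussion IDs with exactly that view count) could be modelled roughly like this. It is an in-memory sketch only: MostViewedRow, record_view and top are hypothetical names, a plain dict stands in for the Cassandra row, and "passes the threshold" is assumed to mean "greater than or equal to".

```python
from collections import defaultdict


class MostViewedRow:
    """In-memory model of the row: column name (view count) -> discussion IDs."""

    def __init__(self, threshold):
        self.threshold = threshold          # assumption: inclusive threshold
        self.views = defaultdict(int)       # discussion ID -> view count
        self.columns = defaultdict(set)     # view count -> set of discussion IDs

    def record_view(self, discussion_id):
        old = self.views[discussion_id]
        new = old + 1
        self.views[discussion_id] = new
        if old >= self.threshold:
            # Move the ID out of the column named after its old count.
            self.columns[old].discard(discussion_id)
            if not self.columns[old]:
                del self.columns[old]
        if new >= self.threshold:
            self.columns[new].add(discussion_id)

    def top(self, n):
        """Read columns from the high end of the integer-sorted row."""
        if n <= 0:
            return []
        result = []
        for count in sorted(self.columns, reverse=True):
            for discussion_id in sorted(self.columns[count]):
                result.append((discussion_id, count))
                if len(result) == n:
                    return result
        return result


row = MostViewedRow(threshold=2)
for discussion_id in ("a", "a", "a", "b", "b", "c"):
    row.record_view(discussion_id)
print(row.top(2))  # [('a', 3), ('b', 2)]
```

The sketch makes the cost of the design visible: once a discussion is over the threshold, every view rewrites two columns, and in a real cluster each move would be a read followed by writes, so concurrent viewers of the same discussion could race.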
