cassandra-user mailing list archives

From Mike Peters <cassan...@softwareprojects.com>
Subject Re: Cassandra 0.8 Counters Inverted Index?
Date Sun, 02 Oct 2011 19:01:20 GMT
Any ideas?


Thanks,
Mike Peters

On 10/1/2011 1:19 AM, Mike Peters wrote:
>
>
> Hi,
>
> We're using Cassandra 0.8 counters in production and loving it!
>
> One issue we're running into is we need an efficient mechanism to 
> retrieve the "top 100" results, sorted by count values.
>
> We have tens of thousands of counters growing rapidly (one counter per
> combination of date.source_id).  What we're looking for is the best
> way to retrieve the top 100 "sources" for a given date without having
> to iterate through all counters created for that date.
>
> Right now to accomplish this, we are managing an inverted index of 
> count values.  This is very inefficient and kills our write 
> performance, because after every counter-increment, we have to read 
> its value and store it into an inverted index that looks like this:
>
> Key,   CounterName
> 000005 2011-10-01.source1
> 000009 2011-10-01.source2
> 000010 2011-10-01.source3
>
> If source2 just generated 100 "hits", we need to delete the row with 
> the key of "000009" from the inverted index and insert a new one with 
> the new counter value for source2:
>
> Key,   CounterName
> 000005 2011-10-01.source1
> 000010 2011-10-01.source3
> 000109 2011-10-01.source2
>
> The additional reads and deletes are killing our performance.
>
> Does anyone have ideas about a more efficient way to use counters 
> and support "top 100" results?
>
> Looking forward to any ideas and feedback you can share.
>
>
> Thanks,
> Mike Peters
>
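
For readers following the thread, the index maintenance described in the quoted message can be sketched roughly as below. This is only an illustration and makes several assumptions: plain in-memory Python structures stand in for the counter and index column families (no Cassandra client calls are shown), the function names pad, increment and top_n are hypothetical, and each index row is stored as a (padded count, counter name) pair so that two counters with the same value do not collide, which the original message does not address.

```python
# In-memory stand-ins for the two column families (assumption, not a Cassandra API).
counters = {}   # "date.source_id" -> current count
index = set()   # rows of (zero-padded count, counter name)


def pad(count, width=6):
    """Zero-pad a count so that string order matches numeric order."""
    return str(count).zfill(width)


def increment(counters, index, name, delta):
    """Increment a counter and keep the inverted index in step with it."""
    old = counters.get(name, 0)       # the extra read after every increment
    new = old + delta
    counters[name] = new
    index.discard((pad(old), name))   # the extra delete of the stale index row
    index.add((pad(new), name))       # the insert of the row with the new count
    return new


def top_n(index, date, n):
    """Return the n highest counters for one date, highest count first."""
    prefix = date + "."
    rows = sorted((row for row in index if row[1].startswith(prefix)), reverse=True)
    return [(name, int(key)) for key, name in rows[:n]]


# Reproduce the example from the message: source2 goes from 9 to 109.
increment(counters, index, "2011-10-01.source1", 5)
increment(counters, index, "2011-10-01.source2", 9)
increment(counters, index, "2011-10-01.source3", 10)
increment(counters, index, "2011-10-01.source2", 100)
print(top_n(index, "2011-10-01", 2))
```

Each call to increment performs one read, one delete and one insert on top of the counter update itself, which is the overhead the message describes as hurting write performance.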

