cassandra-user mailing list archives

From Suman Ghosh <sumanthew...@gmail.com>
Subject Re: Cache layer in front of cassandra... any help / suggestions?
Date Fri, 15 Jul 2011 16:29:49 GMT
>"*What's huge? Number of gigs, ballpark.*"

Data is in the range of 30-40 GB per calendar day per data source for
usage sources like SWITCH or IN, and in the range of 5-10 GB for
non-usage ones like Billing etc. And we run multi-source correlation on
this data every day.


>"*The cassandra row-cache is LRU, and the page cache of OS:es is
>"LRU:ish" (but generally you might see evictions at any time when
>unlucky).*"

As is typical with telecom data records, even the records which have a high
occurrence (if we measure the stats after a certain period of time, say at
EOD) do not always follow a "frequently used" access pattern. So we decided
that we need some sort of "list-based caching" instead of LRU, so that we
control which records we actually *want* to keep in memory and which
we don't.
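
To make the "list-based caching" idea concrete, here is a rough sketch of what
I have in mind. It is only an illustration, not something Cassandra provides:
the class name PinnedCache and the loader callback (a stand-in for a real
Cassandra read) are made up for this example. Only keys on an explicit pin
list are held in memory, and nothing is evicted by recency.

```python
from typing import Callable, Dict, Hashable, Iterable, Set


class PinnedCache:
    """Holds values in memory only for keys on an explicit pin list."""

    def __init__(self, pinned_keys: Iterable[Hashable],
                 loader: Callable[[Hashable], object]):
        self._pinned: Set[Hashable] = set(pinned_keys)
        # Hypothetical stand-in for a read against the backing store.
        self._loader = loader
        self._store: Dict[Hashable, object] = {}

    def get(self, key: Hashable) -> object:
        if key not in self._pinned:
            # Not on the list: always read through, never keep in memory.
            return self._loader(key)
        if key not in self._store:
            self._store[key] = self._loader(key)
        return self._store[key]

    def pin(self, key: Hashable) -> None:
        self._pinned.add(key)

    def unpin(self, key: Hashable) -> None:
        # Removing a key from the list also drops its cached value.
        self._pinned.discard(key)
        self._store.pop(key, None)

    def cached_keys(self) -> Set[Hashable]:
        return set(self._store)
```

The pin list here would be the set of high-usage phone numbers, refreshed by
whatever job computes the usage stats.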


>"*If you use an external cache, keep in mind that you instantly have the
>problem that the cache can become inconsistent with data in Cassandra.*"

Yeah... that's the reason why I'm trying to find out whether Cassandra itself
has some trick to do it (maybe some sort of configuration/list support for
row caching - wishful thinking!)
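
If we do end up with an external cache, the inconsistency can at least be
narrowed by sending every write through the cache layer. A minimal sketch,
assuming a dict-like object stands in for Cassandra (WriteThroughCache and
its method names are hypothetical):

```python
class WriteThroughCache:
    """Updates the backing store and the in-memory copy on every write."""

    def __init__(self, backing):
        # 'backing' is a dict-like stand-in for the real data store.
        self._backing = backing
        self._cache = {}

    def read(self, key):
        if key not in self._cache:
            self._cache[key] = self._backing[key]
        return self._cache[key]

    def write(self, key, value):
        # Write to the store first, then refresh the cached copy.
        self._backing[key] = value
        self._cache[key] = value

    def invalidate(self, key):
        # Needed when something else has written to the store directly.
        self._cache.pop(key, None)
```

This only helps for writes that pass through the cache. Anything that writes
to the store directly still leaves a stale entry until it is invalidated,
which is exactly the problem described above.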

Any suggestions?

-SG.


On Fri, Jul 15, 2011 at 9:39 PM, Peter Schuller <peter.schuller@infidyne.com> wrote:

> > As we work on telecom data records (voice call/sms/GPRS xDRs), the data
> > volume is simply HUGE, and we definitely need a “controlled” caching
> > mechanism in front of the Cassandra layer.
>
> What's huge? Number of gigs, ballpark.
>
> > By the term “controlled cache layer”, what I am trying to suggest is
> > something like maybe maintaining a list of most high-usage (and therefore,
> > high occurrence) phone numbers somewhere, and the cache layer will hold all
> > live data and counters for those numbers in memory. Therefore, all
>
> The cassandra row-cache is LRU, and the page cache of OS:es is
> "LRU:ish" (but generally you might see evictions at any time when
> unlucky).
>
> If you use an external cache, keep in mind that you instantly have the
> problem that the cache can become inconsistent with data in Cassandra.
> You may also want to wait for the off-heap row cache support to be in
> a released version to be more efficient w.r.t. memory usage and GC
> overhead than the normal row caching behavior.
>
> But before asking what the appropriate external cache is, make sure
> you actually do need one first since the lack of guaranteed
> consistency with the Cassandra cluster is usually something that is
> nice to avoid.
>
> --
> / Peter Schuller (@scode on twitter)
>



-- 
Get me at GMail    --> sumanthewhiz[at]gmail[dot]com
... or there's Yahoo --> sumanthewhiz[at]yahoo[dot]com
