incubator-cassandra-user mailing list archives

From "B. Todd Burruss"
Subject Re: cassandra not responding
Date Tue, 16 Mar 2010 20:40:17 GMT
i only anticipate about 2,000,000 hot rows, each with about 4k of data.  
however, we will have a LOT of rows that just aren't used.  right now, 
the data is just one column with a blob of text in it.  but i have new 
data coming in constantly, so i'm not sure how that affects the cache, etc.
i'm skeptical about using any cache really, and would rather just rely on
the OS file system cache (as you mentioned.)  i've been trying the row/key
cache out to see if there's a performance gain somewhere, but i'm not seeing it.
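
For scale, here is a rough back-of-envelope check of the figures in this thread. It is only a sketch: it treats "4k" as 4 KiB, counts raw row payload only, and ignores JVM object overhead and whatever bookkeeping the cache itself keeps. The names `ROW_BYTES` and `cache_bytes` are made up for illustration and are not Cassandra settings.

```python
# Back-of-envelope cache sizing using only figures quoted in this thread.
# Assumption: "4k" per row means 4 KiB of payload, with no per-entry overhead.
ROW_BYTES = 4 * 1024
GIB = 2 ** 30


def cache_bytes(rows: int, row_bytes: int = ROW_BYTES) -> int:
    """Raw payload size, in bytes, of `rows` cached rows."""
    return rows * row_bytes


# The anticipated hot set: about 2,000,000 rows.
hot_set = cache_bytes(2_000_000)

# The case from the quoted question: 10% of an sstable holding 100 million rows.
ten_percent = cache_bytes(100_000_000 * 10 // 100)

print(f"hot set:      {hot_set / GIB:.1f} GiB")
print(f"10% of 100M:  {ten_percent / GIB:.1f} GiB")
```

Under those assumptions the hot set comes to about 7.6 GiB and the 10% case to about 38.1 GiB (roughly 41 GB in decimal units), so a percentage-sized cache would be several times larger than the data that is actually hot.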

Nathan McCall wrote:
> The cache is a "second-chance FIFO" from this library:
> That sounds like an awful lot of churn given the size of the queue and
> the number of references it might keep for the second-chance stuff.
> How big of a hot data set do you need to maintain? The amount of
> overhead for such a large record set may not buy you anything over
> just relying on the file system cache and turning down the heap size.
> -Nate
> On Tue, Mar 16, 2010 at 1:17 PM, B. Todd Burruss wrote:
>> i think i better make sure i understand how the row/key cache works.  i
>> currently have both set to 10%.  so if cassandra needs to read data from an
>> sstable that has 100 million rows, it will cache 10,000,000 rows of data
>> from that sstable?  so if my row is ~4k, then we're looking at ~40gb used by
>> cache?
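
To make the "second-chance FIFO" idea mentioned above concrete, here is a minimal textbook-style sketch of that eviction policy. It is not the code of the library referred to in the thread and not how Cassandra implements its cache; the class name `SecondChanceCache` and its methods are hypothetical. The point is only to show why each entry needs an extra reference flag and why a busy queue can churn on eviction.

```python
from collections import OrderedDict


class SecondChanceCache:
    """Illustrative second-chance FIFO cache (hypothetical, not a library API)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Insertion-ordered queue: key -> [value, referenced_flag]
        self._entries = OrderedDict()

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        entry[1] = True  # mark as referenced; the queue position is unchanged
        return entry[0]

    def put(self, key, value) -> None:
        if key in self._entries:
            self._entries[key][0] = value  # update in place, keep position
            return
        while len(self._entries) >= self.capacity:
            oldest_key, entry = self._entries.popitem(last=False)
            if entry[1]:
                # Referenced since it was queued: clear the flag and requeue
                # at the tail instead of evicting (the "second chance").
                entry[1] = False
                self._entries[oldest_key] = entry
            # Otherwise the entry is dropped, which frees one slot.
        self._entries[key] = [value, False]


cache = SecondChanceCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now marked as referenced
cache.put("c", 3)  # "a" is requeued, "b" is evicted
```

In this sketch a full queue whose entries have all been read recently is walked once, clearing flags, before anything is evicted. That is the kind of churn and per-entry bookkeeping a very large cache would pay for.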
