cassandra-user mailing list archives

From Robert Coli <>
Subject Re: Fetching ONE cell with a row cache hit takes 1 second on an idle box?
Date Wed, 02 Jul 2014 01:40:12 GMT
On Tue, Jul 1, 2014 at 6:06 PM, Kevin Burton <> wrote:

> you know.. one thing I failed to mention.. .is that this is going into a
> "bucket" and while it's a logical row, the physical row is like 500MB …
> according to compaction logs.
> is the ENTIRE physical row going into the cache as one unit?  That's
> definitely going to be a problem in this model.  500MB is a big atomic unit.

Yes, the row cache is a row cache. It caches what the storage engine calls
rows, which CQL calls "partitions." [1] Rows have to be assembled from all
of their row fragments in Memtables/SSTables.

This is a big part of why the "off-heap" row cache's behavior of
invalidation on write is so bad for its overall performance. Updating a
single column in your 500MB row invalidates it and forces you to assemble
the entire 500MB row from disk.
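
To make the cost concrete, here is a toy model in Python. It is only a sketch of the "cache whole rows, invalidate on write" behavior described above, not Cassandra's actual implementation. The class name, the dict standing in for on-disk storage, and the byte counter are all made up for illustration.

```python
class ToyRowCache:
    """Toy model of a whole-row cache that invalidates on write.

    Illustration only: this is not how Cassandra is implemented.
    """

    def __init__(self, storage):
        self.storage = storage      # dict: row key -> {column name: value}
        self.cache = {}             # row key -> fully assembled row
        self.bytes_assembled = 0    # bytes read from "disk" to build rows

    def read(self, key, column):
        if key not in self.cache:
            # Cache miss: the entire row is assembled, not just one column.
            row = dict(self.storage[key])
            self.bytes_assembled += sum(len(v) for v in row.values())
            self.cache[key] = row
        return self.cache[key][column]

    def write(self, key, column, value):
        self.storage.setdefault(key, {})[column] = value
        # Invalidate on write: the cached row is dropped, not updated.
        self.cache.pop(key, None)


# A 500-column row of 1000-byte values stands in for the big bucket.
storage = {"bucket": {f"c{i}": b"x" * 1000 for i in range(500)}}
cache = ToyRowCache(storage)
cache.read("bucket", "c0")          # miss: whole row assembled
cache.read("bucket", "c1")          # hit: nothing assembled
cache.write("bucket", "c0", b"y")   # one-column update invalidates the row
cache.read("bucket", "c1")          # miss again: whole row reassembled
```

In this model a single small write makes the next read pay for the whole row again, which is the effect you would see at 500MB scale.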

The only valid use case for the current off-heap row cache seems to be rows
that are very small, very uniform in size, very hot, and very rarely modified.

Is the ticket for replacing the row cache and its unexpected
characteristics with something more like an actual query cache.

> also.. I assume it's having to do a binary search within the physical row ?

Since the column level bloom filter's removal in 1.2, the only way it can
get to specific columns is via the index.
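
As a rough illustration of what "via the index" can mean, here is a Python sketch that assumes the index is a sorted list with one entry per block of columns, and that the lookup is a binary search over those entries. The entry layout, the column names, the offsets, and the function name are hypothetical and do not reflect the real on-disk format.

```python
from bisect import bisect_left

# Hypothetical index: one entry per block of columns, sorted by column name.
# Each entry is (last column name in the block, byte offset of the block).
index = [("c099", 0), ("c199", 65536), ("c299", 131072), ("c399", 196608)]


def block_offset(index, column):
    """Return the offset of the block that could hold `column`, or None."""
    last_names = [last for last, _ in index]
    # Find the first block whose last column name is >= the wanted column.
    i = bisect_left(last_names, column)
    return index[i][1] if i < len(index) else None
```

The search only narrows the read down to one block. The columns inside that block still have to be scanned to find the requested one.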

