cassandra-user mailing list archives

From Robert Coli <rc...@digg.com>
Subject Re: How come key cache increases speed by x4?
Date Thu, 24 Feb 2011 00:21:43 GMT
On Wed, Feb 23, 2011 at 4:04 PM, buddhasystem <potekhin@bnl.gov> wrote:

> Well I know the cache is there for a reason, I just can't explain the factor
> of 4 when I run my queries on a hot vs cold cache. My queries are actually a
> chain of one on an inverted index, which produces a tuple of keys to be used
> in the "main" query. The inverted index query should be downright trivial.
>
> I see the turnaround time per row go down to 1 ms from 4 ms. Am I missing
> something? Why such a large factor?

(Simplified for discussion purposes; not necessarily an exhaustive
description.)

Path in the cold key cache case :

a) check all bloom filters, 1 per sstable in the CF, all of which are in memory
b) read the index file (not in memory) and traverse the index for every
sstable that returns positive in a)
c) read the actual data file once for every sstable

Path in the hot key cache case :

a) read list of filenames and offsets from key cache
b) read the actual data file

You will notice that the former involves a lot more seeking than the
latter, especially if you have "many" sstables. This seeking almost
certainly is the cause of your observed difference. If you graph I/O
throughput in the two different cases, you will almost certainly see
yourself doing more (slow) I/O in the cold cache case. Memory spent on
key cache is usually relatively well spent, for this reason.
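
As a rough illustration of the two paths above, here is a toy Python model
that only counts disk seeks. It is a sketch under stated assumptions, not
Cassandra code: the SSTable class and both function names are made up for
this example, each index lookup and each data file read is charged exactly
one seek, and the data file is assumed to be read only for sstables that
actually hold the key.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    """Hypothetical stand-in for one sstable, as seen by a single key lookup."""
    name: str
    bloom_positive: bool  # in-memory bloom filter answers "maybe present"
    has_key: bool         # the key really is stored in this sstable


def cold_read_seeks(sstables):
    """Count seeks with a cold key cache (assumed cost: 1 seek per file touched)."""
    seeks = 0
    for table in sstables:
        if not table.bloom_positive:
            continue      # step a: bloom filter is in memory, no disk access
        seeks += 1        # step b: read and traverse the on-disk index
        if table.has_key:
            seeks += 1    # step c: read the data file
    return seeks


def hot_read_seeks(sstables):
    """Count seeks with a hot key cache: offsets are known, so only data reads."""
    return sum(1 for table in sstables if table.has_key)


# Example: key lives in two sstables, one bloom filter false positive,
# and one sstable ruled out by its bloom filter.
tables = [
    SSTable("a", bloom_positive=True, has_key=True),
    SSTable("b", bloom_positive=True, has_key=True),
    SSTable("c", bloom_positive=True, has_key=False),
    SSTable("d", bloom_positive=False, has_key=False),
]
print(cold_read_seeks(tables), hot_read_seeks(tables))
```

The gap between the two counts grows with the number of sstables that pass
the bloom filter check, which is the "many sstables" effect described above.
Real ratios depend on the OS page cache and on how the index is laid out,
neither of which this model tries to capture.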

=Rob
