cassandra-user mailing list archives

From Jouni Hartikainen <>
Subject Re: Read IO
Date Thu, 21 Feb 2013 16:45:49 GMT


On Feb 21, 2013, at 7:52 , Kanwar Sangha <> wrote:
> Hi – Can someone explain the worst case IOPS for a read? No key cache, no row cache,
> sampling rate say 512.
> 1)      Bloom filter will be checked to see existence of key (in RAM)
> 2)      Index file sample (in RAM) will be checked to find approx. location in index
> file on disk
> 3)      1 IOPS to read the actual index file on disk (DISK)
> 4)      1 IOPS to get the data from the location in the sstable (DISK)
> Is this correct?

Since you asked for the worst case, I would add one more step: a seek inside the SSTable
from the row start to the queried columns, using the column index.

However, this applies only if you are querying a subset of the columns in the row (not all
of them) and the total row size exceeds column_index_size_in_kb (defaults to 64 kB).
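
To make the condition concrete, here is a minimal Python sketch of when that extra in-row seek would come into play. The function and parameter names are made up for illustration (this is not Cassandra code), and it assumes the threshold is compared in bytes with 1 kB = 1024 bytes.

```python
def needs_column_index_seek(row_size_bytes, reads_whole_row,
                            column_index_size_in_kb=64):
    """Return True when the extra seek via the column index would apply.

    Hypothetical helper: it only encodes the two conditions from the text,
    namely a partial-row query and a row larger than the threshold.
    """
    # Assumption: the setting is in kB and 1 kB = 1024 bytes.
    threshold_bytes = column_index_size_in_kb * 1024
    return (not reads_whole_row) and row_size_bytes > threshold_bytes


# A 200 kB row queried for a few columns would need the extra seek,
# while the same row read in full, or a 10 kB row, would not.
print(needs_column_index_seek(200 * 1024, reads_whole_row=False))
print(needs_column_index_seek(200 * 1024, reads_whole_row=True))
print(needs_column_index_seek(10 * 1024, reads_whole_row=False))
```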

So, as far as I understand, the worst-case steps (without any caches) are:

1. Check the SSTable bloom filters (in memory)
2. Use index samples to find the approximate position in the key index file (in memory)
3. Read the key index file until the correct key is found (1st disk seek & read)
4. Seek to the start of the row in the SSTable file and read the row headers (possibly including
the column index) (2nd seek & read)
5. Using the column index, seek to the correct place inside the SSTable file to actually read
the columns (3rd seek & read)

If the row is very wide and you are asking for a random bunch of columns from here and there,
step 5 might even be needed multiple times. Also, if your row is spread over many SSTables,
each of them needs to be accessed (at least steps 1-4) to get the complete results for
the query.
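
As a rough back-of-the-envelope model of the above, the worst-case seek count could be sketched like this in Python. The names are hypothetical, and it assumes that every SSTable holding part of the row needs the same number of column-index seeks, which makes the result an upper estimate rather than an exact figure.

```python
def worst_case_seeks(sstables_with_row, column_ranges=1, uses_column_index=True):
    """Estimate disk seeks for one uncached read (illustrative model only).

    Bloom filter and index sample lookups happen in memory, so they cost
    no seeks here.
    """
    # One seek into the key index file plus one seek to the row start.
    per_sstable = 2
    if uses_column_index:
        # One more seek per scattered group of columns that is requested.
        per_sstable += column_ranges
    return sstables_with_row * per_sstable


print(worst_case_seeks(1))                           # 3
print(worst_case_seeks(1, uses_column_index=False))  # 2
print(worst_case_seeks(4, column_ranges=3))          # 20
```

The last call shows how quickly the count grows once a wide row is spread over several SSTables and the query picks columns from several places.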

With all this in mind, if your node serves any reasonable amount of reads, I'd say that in
practice the key index files will be page cached by the OS very quickly, and thus a normal
read would end up being either one seek (for small rows without the column index) or two
(for wider rows). Of course, as Peter already pointed out, the more columns you ask for, the
more the disk needs to read. For a contiguous set of columns the read should be linear, however.
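
For comparison, the "in practice" case can be put into a small Python sketch as well. It assumes the key index file is always served from the OS page cache (zero disk seeks) and that the query reads one contiguous set of columns. Multiplying by the number of SSTables is my own extrapolation, and the names are hypothetical.

```python
def typical_seeks(sstables_with_row, wide_row):
    """Estimate disk seeks for a read when key index files are page cached."""
    # Small row: one seek to the row start.
    # Wide row: one seek to the row start, one more via the column index.
    per_sstable = 2 if wide_row else 1
    return sstables_with_row * per_sstable


print(typical_seeks(1, wide_row=False))  # 1
print(typical_seeks(1, wide_row=True))   # 2
print(typical_seeks(3, wide_row=True))   # 6
```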
