incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: cassandra disk access
Date Thu, 08 Aug 2013 02:26:53 GMT
Some background on the read and write paths. Some of the extra details are a little out of
date, but they are mostly correct for 1.2.

http://www.slideshare.net/aaronmorton/cassandra-community-webinar-introduction-to-apache-cassandra-12-20353118/40
http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/
http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/

Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/08/2013, at 9:07 PM, Michał Michalski <michalm@opera.com> wrote:

> I'm not sure how accurate it is (it's from 2011, one of its sources is from 2010), but
> I'm pretty sure it's more or less OK:
> 
> http://blog.csdn.net/firecoder/article/details/7019435
> 
> M.
> 
> On 07.08.2013 10:34, Nikolay Mihaylov wrote:
>> thanks
>> 
>> It will use the Index Sample (RAM) first, then it will use the "full" Index
>> (disk), and finally it will read the data from the SSTable (disk). There's no
>> such thing as a "collision" in this case.
>> 
>> So it still has 2 seeks :)
>> 
>> Where can I see the internal structure of the SSTable? I tried to find it
>> documented but was unable to find anything.
>> 
>> 
>> 
>> 
>> On Wed, Aug 7, 2013 at 11:27 AM, Michał Michalski <michalm@opera.com> wrote:
>> 
>>> 
>>>> 2. When Cassandra looks up a key in an SSTable (assuming the bloom filter and
>>>> other "stuff" failed, and also assuming the key is located in this single
>>>> SSTable), Cassandra DOES NOT USE sequential I/O. "She" will probably read the
>>>> hash-table slot or a similar structure, then do another disk seek in order
>>>> to get the value (and probably the key). There will probably be a need for
>>>> another seek as well, and if there is a key collision, additional seeks
>>>> will be needed.
>>>> 
>>> 
>>> It will use the Index Sample (RAM) first, then it will use the "full" Index
>>> (disk), and finally it will read the data from the SSTable (disk). There's no
>>> such thing as a "collision" in this case.
>>> 
>>> 
>>>> 3. Once the data (e.g. the row) is located, a sequential read of the entire
>>>> row will occur. (Once again, I assume there is a single well-compacted
>>>> SSTable.) Also, if the disk is not fragmented, the data will be placed on disk
>>>> sectors one after the other.
>>>> 
>>> 
>>> Yes, this is how I understand it too.
>>> 
>>> M.
>>> 
>>> 
>> 
> 
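
The lookup order described in the thread (Index Sample in RAM, then the "full"
Index on disk, then the data in the SSTable on disk) can be illustrated with a
rough toy model. This is only a sketch and not Cassandra's actual code or file
format: the class name ToySSTable, the sample_every parameter and the
disk_seeks counter are all made up for illustration, plain string ordering
stands in for the real key ordering, and the bloom filter and caches are left
out entirely.

```python
import bisect


class ToySSTable:
    """Toy model (hypothetical): sorted rows, a full index, a sparse in-RAM sample."""

    def __init__(self, rows, sample_every=4):
        # "Data file": rows sorted by key, as they would be laid out on disk.
        self.data = sorted(rows.items())
        # "Full index" (on disk): every key with its position in the data file.
        self.index = [(key, pos) for pos, (key, _) in enumerate(self.data)]
        # "Index sample" (in RAM): every Nth index key and its index position.
        self.sample_pos = list(range(0, len(self.index), sample_every))
        self.sample_keys = [self.index[i][0] for i in self.sample_pos]
        self.sample_every = sample_every
        # Counts simulated disk seeks so the "2 seeks" claim can be checked.
        self.disk_seeks = 0

    def get(self, key):
        # Step 1 (RAM, no seek): find the last sampled key <= the requested key.
        slot = bisect.bisect_right(self.sample_keys, key) - 1
        if slot < 0:
            return None  # key sorts before everything in this SSTable
        start = self.sample_pos[slot]
        # Step 2 (seek #1): read one block of the full index sequentially.
        self.disk_seeks += 1
        for index_key, data_pos in self.index[start:start + self.sample_every]:
            if index_key == key:
                # Step 3 (seek #2): jump straight to the row in the data file.
                self.disk_seeks += 1
                return self.data[data_pos][1]
        return None  # key falls inside this block's range but is not present
```

In this model a successful get() costs exactly two simulated seeks, one into
the full index and one into the data file, which matches the "2 seeks"
observation above. Because the index is sorted rather than hashed, there is
nothing that corresponds to a hash collision.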

