From: Aaron Morton
To: user@cassandra.apache.org
Subject: Re: cassandra disk access
Date: Thu, 8 Aug 2013 14:26:53 +1200

Some background on the read and write paths. Some of the extra details
are a little out of date, but they are mostly correct for 1.2:

http://www.slideshare.net/aaronmorton/cassandra-community-webinar-introduction-to-apache-cassandra-12-20353118/40
http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/
http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/

Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/08/2013, at 9:07 PM, Michał Michalski <michalm@opera.com> wrote:

> I'm not sure how accurate it is (it's from 2011, one of its sources is
> from 2010), but I'm pretty sure it's more or less OK:
>
> http://blog.csdn.net/firecoder/article/details/7019435
>
> M.
>
> On 07.08.2013 10:34, Nikolay Mihaylov wrote:
>> thanks
>>
>> It will use the Index Sample (RAM) first, then it will use the "full"
>> Index (disk), and finally it will read data from the SSTable (disk).
>> There's no such thing as a "collision" in this case.
>>
>> so it still has 2 seeks :)
>>
>> Where can I see the internal structure of the SSTable? I tried to find
>> it documented but was unable to find anything.
>>
>> On Wed, Aug 7, 2013 at 11:27 AM, Michał Michalski <michalm@opera.com> wrote:
>>
>>>> 2. When Cassandra looks up a key in an SSTable (assuming the bloom
>>>> filter and other "stuff" failed, and also assuming the key is
>>>> located in this single SSTable), Cassandra DOES NOT USE sequential
>>>> I/O. "She" will probably read the hash-table slot or a similar
>>>> structure, then Cassandra will do another disk seek in order to get
>>>> the value (and probably the key). Another seek will probably be
>>>> needed as well; if there is a key collision, additional seeks will
>>>> be needed.
>>>
>>> It will use the Index Sample (RAM) first, then it will use the "full"
>>> Index (disk), and finally it will read data from the SSTable (disk).
>>> There's no such thing as a "collision" in this case.
>>>
>>>> 3. Once the data (e.g. the row) is located, a sequential read of the
>>>> entire row will occur. (Once again I assume there is a single,
>>>> well-compacted SSTable.) Also, if the disk is not fragmented, the
>>>> data will be placed on disk sectors one after the other.
>>>
>>> Yes, this is how I understand it too.
>>>
>>> M.
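
The lookup order described in the thread (Index Sample in RAM, then the "full" Index on disk, then the SSTable data on disk) can be sketched roughly as below. This is only a toy Python model under stated assumptions, not Cassandra's actual code or on-disk format: the class `ToySSTable`, the `sample_every` parameter, and the `disk_seeks` counter are all hypothetical names, and plain Python lists stand in for the index and data files.

```python
import bisect


class ToySSTable:
    """Toy model of one SSTable: sorted data, a full index, and a sampled index.

    Hypothetical illustration only; lists stand in for on-disk files.
    """

    def __init__(self, rows, sample_every=4):
        # Rows are stored sorted by key, as in an SSTable data file ("disk").
        self.data = sorted(rows.items())
        # Full index ("disk"): one (key, position in data) entry per row.
        self.index = [(key, pos) for pos, (key, _) in enumerate(self.data)]
        # Index sample ("RAM"): every Nth index entry and its index position.
        self.sample_every = sample_every
        self.sample = [
            (self.index[i][0], i) for i in range(0, len(self.index), sample_every)
        ]
        self.sample_keys = [key for key, _ in self.sample]
        self.disk_seeks = 0  # counts simulated disk seeks

    def get(self, key):
        # Step 1 (RAM): find the last sampled key that is <= the wanted key.
        i = bisect.bisect_right(self.sample_keys, key) - 1
        if i < 0:
            return None  # key sorts before everything in this SSTable
        start = self.sample[i][1]

        # Step 2 (seek 1): scan one block of the full index from that point.
        self.disk_seeks += 1
        for indexed_key, pos in self.index[start:start + self.sample_every]:
            if indexed_key == key:
                # Step 3 (seek 2): read the row at its exact position.
                self.disk_seeks += 1
                return self.data[pos][1]
        return None  # not in the index, so no seek into the data file


table = ToySSTable({f"key{n:02d}": n for n in range(20)})
print(table.get("key07"), table.disk_seeks)
```

In this model a hit costs two simulated seeks (one into the index, one into the data), which matches the "so it still has 2 seeks" observation above, and there is no collision handling because the structure is a sorted index rather than a hash table.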
