Date: Sat, 11 Aug 2012 14:31:37 -0400
Subject: Re: quick question about data layout on disk
From: Edward Capriolo
To: user@cassandra.apache.org

Aaron,

I have not dug into the data files in a while, but this is how I understand it.

http://wiki.apache.org/cassandra/ArchitectureSSTable

There is no need to store the row key with every column. Row key to columns is a one-to-many relationship.

This would be a diagram of a physical file:

HBase does it like this, I guess (or it used to; I am not up on the news):

rowkey1,column1,value1
rowkey1,column2,value2

First, I believe they repeat the row key for each column. That is not a huge deal, because you should always use compression, but it is a bit wasteful, especially for a non-compressed table. I know this has some impact on very wide rows, because a single rowkey must fit inside this structure of an hfile (?). Again, it's been a while.

But to get back to your question. In Cassandra:

sstable1
rowkey1: number of columns 26
(column1,value1,ts1)
....
(column26,value26,ts26)

sstable2
rowkey1: number of columns 1
(column1,value1,ts2)

The row key appears once in a given sstable if the row has one or more columns in that sstable. On the read path, Cassandra searches all sstables to find all the columns for a row (bloom filters and other criteria eliminate some sstables from the read path). It then merges the row, factoring in tombstones and the last-update-wins rule for each column.

On Sat, Aug 11, 2012 at 2:03 PM, Aaron Turner wrote:
> So how does that work? An sstable is for a single CF, but it can and
> likely will have multiple rows.
> There is no read to write and as I
> understand it, writes are append operations.
>
> So if you have an sstable with say 26 different rows (A-Z) already in
> it with a bunch of columns and you add a new column to row J, how does
> Cassandra store the column/value pair on disk in a way to refer to row
> J without re-writing the row key or some representation of it?
>
> Thanks,
> Aaron
>
> On Fri, Aug 10, 2012 at 7:53 PM, Terje Marthinussen wrote:
>> Rowkey is stored only once in any sstable file.
>>
>> That is, in the special case where you get one sstable file per column/value, you are correct, but normally, I guess most of us are storing more per key.
>>
>> Regards,
>> Terje
>>
>> On 11 Aug 2012, at 10:34, Aaron Turner wrote:
>>
>>> Curious, but does cassandra store the rowkey along with every
>>> column/value pair on disk (pre-compaction) like Hbase does? If so
>>> (which makes the most sense), I assume that's something that is
>>> optimized during compaction?
>>>
>>>
>>> --
>>> Aaron Turner
>>> http://synfin.net/ Twitter: @synfinatic
>>> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
>>> Those who would give up essential Liberty, to purchase a little temporary
>>> Safety, deserve neither Liberty nor Safety.
>>> -- Benjamin Franklin
>>> "carpe diem quam minimum credula postero"
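
As a rough illustration of the read path described in the reply above, here is a minimal Python sketch. It is only a toy model under stated assumptions, not Cassandra's actual code or on-disk format: each sstable is a plain dict keyed by row key, a value of None stands in for a tombstone, and the names read_row, SSTable and Column are made up for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory model, not Cassandra's on-disk format:
# an sstable maps row key -> {column name: (value, timestamp)}.
# A value of None stands in for a tombstone (a deleted column).
Column = Tuple[Optional[str], int]
SSTable = Dict[str, Dict[str, Column]]


def read_row(sstables: List[SSTable], row_key: str) -> Dict[str, str]:
    """Merge one row's columns from every sstable that holds the row key."""
    merged: Dict[str, Column] = {}
    for sstable in sstables:
        # A real read would first consult bloom filters to skip sstables;
        # this sketch simply checks whether the row key is present.
        columns = sstable.get(row_key)
        if columns is None:
            continue
        for name, (value, ts) in columns.items():
            # Last update wins: keep the cell with the highest timestamp.
            if name not in merged or ts > merged[name][1]:
                merged[name] = (value, ts)
    # Tombstoned columns are dropped from the result.
    return {name: value for name, (value, _ts) in merged.items()
            if value is not None}


# The row key appears once per sstable, followed by its columns.
sstable1 = {"rowkey1": {"column1": ("value1", 1), "column2": ("value2", 1)}}
sstable2 = {"rowkey1": {"column1": ("value1-new", 2), "column2": (None, 3)}}

row = read_row([sstable1, sstable2], "rowkey1")
print(row)  # {'column1': 'value1-new'}
```

The newer write to column1 in sstable2 replaces the older one, and the tombstone for column2 hides the value still sitting in sstable1. The order in which the sstables are scanned does not matter here, because only the timestamps decide which cell survives.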