From: Jonathan Shook
To: user@cassandra.apache.org
Subject: Re: Conditional get
Date: Sat, 5 Jun 2010 17:33:24 -0500

It sounds like you are getting a handle on it, but maybe in a
round-about way. Here are some ways I like to conceptualize Cassandra.
Maybe they can shorten your walk.

Either the grid analogy or the maps-of-maps analogy can apply, as they
both map conceptually to the way that we use a column family.

--
The maps-of-maps analogy:

Please try to think of the "column" as the intersection between a row
key and a column name. This captures the most essential concepts.

It's easier for me to think of it in terms of a sorted map to a sorted
map, where:
* the outer map is the set of rows, whose (map) keys and (map) values
  are (Cassandra) keys and (Cassandra) rows
* the inner map for each row key is the set of columns, whose keys and
  values are column names and column data.
* column data is essentially a molecule of (column name, column value,
  storage timestamp). It can be thought of as the "value", but it is
  stored as a 3-tuple.

--
The grid analogy: (This one is my favorite)

In the grid analogy, rows may be undefined. Rows that are defined may
have columns that are undefined.

Two things to think about when using this analogy:

Cassandra doesn't have to store undefined values, except during deletes
and before anti-entropy takes them away.

Cassandra operates behind the scenes in row-major order.
That means that while you can think of it in terms of a Cartesian
intersection, you should know that rows will always be accessed first.

--
Another layer outward is the column family, which is also a map.
Another layer inward is the sub-column, which is also a map.

Don't get confused by super columns or sub columns. Super/sub columns
are really API sugar to reduce some of the work of using your own
serialized aggregates within a normal column value. I find that the
confusion is usually not worth the trouble when starting out. On the
other hand, were you to implement your own aggregate types within a
column value, the purpose of super/sub columns would seem obvious. It's
just a little overly complex because of the supporting types in the API.
Since this was basically bolted on to the standard column support, it
falls under normal column behavior as far as the core Cassandra
machinery is concerned.

Neither the column family layer nor the subcolumn layer has been given
the same attention as the basic row->column mapping with respect to
performance and scalability. This may change in the future. For now,
consider that row keys and column names are the only places where
Cassandra is able to scale the best.

Jonathan

On Sat, Jun 5, 2010 at 4:06 PM, Peter Schuller wrote:
>> Eric wrote a good explanation with sample code at
>> http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/
>
> Regarding the schema description and analogy problem mentioned in the
> article; I found that reading the BigTable paper helped a lot for me.
> It seemed very useful to me to think of a ColumnFamily in Cassandra as
> a sorted (on keys) on-disk table of entries with efficiency guarantees
> with respect to range queries and locality on disk.
>
> Please correct me if I am wrong, but the data model as I now
> understand it essentially boils down to a sorted table of the form
> (readers who don't know the answer, please don't assume I'm right
> unless someone in the know confirms it; I don't want to add to the
> confusion):
>
>  rowkeyN+0,columnM+0 data
>  rowkeyN+0,columnM+1 data
>  ...
>  rowkeyN+1 data
>  rowkeyN+2 data
>  ...
>
> Where each piece of "data" is the column (I am ignoring super
> columns for now).
>
> The table, other than being sorted, is indexed on row key and column name.
>
> Is this correct?
>
> In my head I think of it as there being some N amount of "keys" (not
> the cassandra term) that are interesting to the application, which end
> up mapping to the actual "key" (not the cassandra term) in the table.
> So, in a column family "users", we might have a "john doe" whose age
> is "47". This means we have a "key" (not the cassandra term) which is
> "users,john doe,age" and whose value is "47" (ignoring time stamps and
> ignoring keys that contain commas, and ignoring column names being
> semantically part of the data).
>
> So, given:
>
>       users,john doe,age
>
> We have, in cassandra terms:
>
>  column family: users
>  key: john doe
>  column name: age
>
> The fact that different column families are in different files, to me,
> seems mostly to be an implementation detail, since performance
> characteristics (sorting, locality on disk) should be the same as they
> would have been if it were just one huge table (ignoring compaction
> concerns, etc).
>
> The API exposed by cassandra is not one of a generalized multi-level
> key, but rather one with specific concepts of ColumnFamily, Column and
> SuperColumn.
> These essentially provide a two-level key (in the case
> of a CF with C:s) and a three-level key (in the case of a CF with SC:s
> with C:s), with the caveat that three-level keys are still only
> indexed on their first two components (even though they are still
> sorted on disk).
>
> Does this make sense at all? Provided that I have not misunderstood
> the model completely and am completely wrong, I find this a much more
> natural way to think of the underlying storage semantics.
>
> --
> / Peter Schuller
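
For anyone who prefers code to analogies, here is a minimal Python sketch
of the two views discussed in this thread: the maps-of-maps model (row key
-> column name -> 3-tuple) and the flattened, sorted table of
"column family,row key,column name" entries. It is an illustration only,
not Cassandra's API or storage code. The names Column, put, get and
flatten are made up for this example, rows are sorted by plain string
order as a simplifying assumption, and every value other than
"john doe"/"age"/"47" is hypothetical.

```python
from typing import Dict, Iterator, NamedTuple, Optional, Tuple


class Column(NamedTuple):
    """The stored 3-tuple: (column name, column value, storage timestamp)."""
    name: str
    value: str
    timestamp: int


# Inner map: column name -> column data.
Row = Dict[str, Column]
# Outer map: row key -> row. One such map models one column family.
ColumnFamily = Dict[str, Row]


def put(cf: ColumnFamily, row_key: str, name: str, value: str,
        timestamp: int) -> None:
    """Store a column at the intersection of a row key and a column name."""
    cf.setdefault(row_key, {})[name] = Column(name, value, timestamp)


def get(cf: ColumnFamily, row_key: str, name: str) -> Optional[Column]:
    """Look up the row first, then the column (row-major access).

    Undefined rows and undefined columns both yield None; nothing is
    stored for them, as in the grid analogy.
    """
    return cf.get(row_key, {}).get(name)


def flatten(cf: ColumnFamily) -> Iterator[Tuple[str, str, Column]]:
    """Yield (row key, column name, column) in sorted order.

    This is the "one sorted table" view: entries ordered by row key,
    then by column name within each row.
    """
    for row_key in sorted(cf):
        row = cf[row_key]
        for name in sorted(row):
            yield row_key, name, row[name]


# Example column family "users"; timestamps and the extra rows/columns
# are hypothetical values chosen only to show the ordering.
users: ColumnFamily = {}
put(users, "john doe", "age", "47", timestamp=1)
put(users, "john doe", "email", "jd@example.invalid", timestamp=1)
put(users, "alice", "age", "30", timestamp=1)

for row_key, name, column in flatten(users):
    print(f"users,{row_key},{name} -> {column.value}")
```

Both views hold the same data: get() walks the nested maps the way the
maps-of-maps analogy describes, while flatten() produces the two-level
"row key,column name" ordering from the quoted message.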