From: David Strauss
Organization: Four Kitchens
Date: Fri, 16 Jul 2010 04:25:44 +0000
To: user@cassandra.apache.org
Subject: Re: A very short summary on Cassandra for a book

On 2010-07-16 01:57, Dave Viner wrote:
> I am no expert... but parts seem accurate, parts not.
>
> "Cassandra stores four or five dimension associated arrays"
> not sure what you're counting as a dimension of the associated array,
> but here are the 2 associative array-like syntaxes:
>
> ColumnFamily[row-key][column-name] = value1
> ColumnFamily[row-key][super-column-name][column-name] = value2

You're forgetting the first dimension: the keyspace. However, that
dimension is mostly a scope for configuration and administration, just
like MySQL "databases" on a single MySQL instance.

> "The first dimension is fixed on creation of the database but the
> rest can be infinitely large"
> I don't understand this sentence. The definition of a ColumnFamily is
> set by the configuration file (storage-conf.xml). If you change it, and
> restart a node, that node will use the new definition of the CF.

For a book, I would avoid pinning down what's dynamic at runtime and
what's fixed at startup because that's changing rapidly with upcoming
versions. Cassandra 0.7 features dynamic keyspace and column family
creation, and its release is going to happen well before the end of 2010.
Even now, it's possible to modify most configurations with no disruption
via a rolling cluster restart.

> It is true that the number of columns can be large.
> I have no idea if it's actually infinite - but more or less.

There is no hard cap on the number of columns in a row. Real-world
systems are known to comfortably scale to millions of columns per row.

In current Cassandra releases, however, each super-column must fit into
memory. This is because the current architecture treats super-columns
and columns very similarly. While it's planned to change this for future
releases, there's interest in a broader overhaul allowing arbitrary
dimensionality; I wouldn't count on any change soon.

Also -- and this isn't much of a restriction -- each row must fit on a
single node's disk.

> Also, it's probably not precise to call it a database, since that tends
> to invoke images of things like MySQL, Oracle, Postgres, etc.

Those are *relational* databases. Historically, "database" has been a
general term for persistent data stores.

> "Inserts are super fast and can happen to any
> database server in the cluster."
> Yes, this is true.

Not 100% true. The sharding/partitioning mechanism in Cassandra assigns
each row to at least one server in the cluster (more if the replication
level is higher than one). It's possible to "write" to any server in the
cluster, but the write will only complete once confirmed on an
appropriate number of nodes (based on ConsistencyLevel).
ConsistencyLevel.ZERO is a special exception that allows nearly blind
writes to any node in the cluster, asynchronously replicating the data
to the proper nodes, but most applications use at least
ConsistencyLevel.ONE for any serious writes.

The replication topology also affects write latency. Using a RackAware
approach, Cassandra will often require a confirmed write at a remote
location.

Cassandra intentionally allows applications to dynamically decide read
and write latency tradeoffs against consistency guarantees. So, I'd say
writes in Cassandra are "as fast as your consistency and durability
requirements allow."
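
To make the ConsistencyLevel tradeoff above concrete, here is a minimal
Python sketch. It is a toy model and not Cassandra's client API: the
node names, the hash-based placement, and the functions replicas_for,
required_acks, and write are all assumptions made up for illustration.
ZERO and ONE are the levels discussed above; QUORUM and ALL are included
only to show how the required number of confirmations grows.

```python
import hashlib

# Toy model only: none of these names come from Cassandra's API.
NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster
REPLICATION_FACTOR = 3                            # assumed for the example


def replicas_for(row_key, nodes=NODES, rf=REPLICATION_FACTOR):
    """Pick rf consecutive nodes, starting at a position derived from the key."""
    digest = hashlib.md5(row_key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]


def required_acks(level, rf=REPLICATION_FACTOR):
    """Number of replica confirmations a write waits for at each level."""
    table = {"ZERO": 0, "ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}
    return table[level]


def write(row_key, level, reachable):
    """Return True if enough reachable replicas can confirm the write."""
    replicas = replicas_for(row_key)
    acks = sum(1 for node in replicas if node in reachable)
    return acks >= required_acks(level)
```

In this model, write("user:42", "ZERO", set()) succeeds even with no
replica reachable, while the same write at "ONE" fails: the stricter the
level, the more replicas must answer before the write counts as
complete, which is where the extra latency comes from.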
> "However, the system is append only there so there is no in-place update
> operation like increment"
> The first part is not quite true. There is appending, but there is no
> increment that's guaranteed universal. Cassandra is "eventually
> consistent". So atomic increment doesn't really work in the "eventual"
> world. But, more precisely, one can add, update, change, modify, delete
> rows, columns, and values at any time from any node.

The lack of increment support has little to do with eventual consistency
and everything to do with timestamp-based conflict resolution. With
vector clocks (likely landing in 0.7 as a result of Digg's work), it
will be possible to support increment and decrement operations, just not
ones that give you an instant, unique result. The actual inc and dec
support probably won't be in 0.7, though.

> "Also sorting happens on insert time"
> Yes, I believe this is true.

Basically true. I could nitpick, but it wouldn't add much clarity to the
discussion.

-- 
David Strauss
   | david@fourkitchens.com
   | +1 512 577 5827 [mobile]
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]