From: Peter Hsu
To: user@cassandra.apache.org
Subject: Re: Human readable Cassandra limitations
Date: Mon, 10 May 2010 13:23:35 -0700

Thanks for the response, Paul.

Very helpful, but very general at the same time. I'm still having trouble translating these into actual use cases. Let me think of some better questions before I continue the thread, but I'd like to address one of the weaknesses you brought up:

> * Cassandra and its siblings are weak at ad hoc queries on tables
> that you did not think to index in advance

What is the normal way of dealing with this in Cassandra? Would you just create a new "index" and bring a big honking machine to the table to process all the existing data in the database and store the new "index"?

On May 10, 2010, at 11:22 AM, Paul Prescod wrote:

> This is a very, very big topic. For the most part, the issues are
> covered in the various SQL versus NoSQL debates all over the Internet.
> For example:
>
> * Cassandra and its NoSQL siblings have no concept of an in-database "join"
>
> * Cassandra and its NoSQL siblings do not allow you to update
> multiple "tables" in a single transactions
>
> * Cassandra's API is specific to it, and not portable to any other data store
>
> * Cassandra currently has simplistic facilities to deal with various
> kinds of conflicting write.
>
> * Cassandra is strongly optimized for multiple machine distributions,
> whereas relational databases tend to be optimized for a single
> powerful machine.
>
> * Cassandra and its siblings are weak at ad hoc queries on tables
> that you did not think to index in advance
>
> On Mon, May 10, 2010 at 11:06 AM, Peter Hsu wrote:
>> I've seen a lot of threads and posts about why Cassandra is great. I'm fairly sold on the features, and the few big deployments on Cassandra give it a lot of credibility.
>>
>> However, I don't believe in magic bullets, so I really want to understand the potential downsides of Cassandra. Right now, I don't really have a clue as to what Cassandra is bad at. I took a look at http://wiki.apache.org/cassandra/CassandraLimitations which is helpful, but doesn't characterize its weaknesses in ways that I can really comprehend until I've actually used Cassandra and understand some of the internals. It seems that the community would benefit from being able to answer some of these questions in terms of real world use cases.
>>
>> My main questions:
>> * Are there designs in which a SQL database out-performs or out-scales Cassandra?
>> * Is there a pros vs cons page of Cassandra against an open source SQL database (MySQL or Postgres)?
>>
>> I do plan on attending the training session next Friday in Palo Alto, but it'd be great if I had some more food for thought before I attend.
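
To make the question about creating a new "index" concrete, here is a toy sketch of what that amounts to in principle: one full scan over the existing rows that writes an inverted mapping into a second table, which the application then has to keep in sync on every write. Plain Python dicts stand in for the data store; nothing here is Cassandra's API, and the names `build_index`, `put`, `users`, and `city_index` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_index(rows, column):
    """Scan every row once and map each value of `column` to the row keys holding it."""
    index = defaultdict(set)
    for key, columns in rows.items():
        if column in columns:
            index[columns[column]].add(key)
    return index


def put(rows, index, column, key, columns):
    """Write a row and keep the hand-built index in sync with it."""
    old = rows.get(key, {})
    if column in old:
        # Drop the stale index entry before the row is overwritten.
        index[old[column]].discard(key)
    rows[key] = columns
    if column in columns:
        index[columns[column]].add(key)


# Hypothetical existing data that was never indexed by city.
users = {
    "u1": {"name": "alice", "city": "Palo Alto"},
    "u2": {"name": "bob", "city": "Austin"},
    "u3": {"name": "carol", "city": "Palo Alto"},
}

# One-off backfill: the "process all the existing data" step.
city_index = build_index(users, "city")
print(sorted(city_index["Palo Alto"]))

# Later writes must update the index as well as the row.
put(users, city_index, "city", "u2", {"name": "bob", "city": "Palo Alto"})
print(sorted(city_index["Palo Alto"]))
```

The backfill touches every row, which is why it is expensive on a large data set, and the sketch ignores what a distributed store would have to handle, such as writes arriving while the scan is still running.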