From: Jonathan Ellis
Date: Thu, 29 Oct 2009 15:59:18 -0600
Subject: Re: HBase vs. Cassandra: new article!
To: cassandra-user@incubator.apache.org, chris@chriswere.com
Cc: cassandra-dev@incubator.apache.org

Okay, here are some corrections. It's a bit choppy because it's just that: a list of corrections. Again, this is just trying to address factual errors; I disagree with many of the expressed opinions, too. :)

> Cassandra relies mostly on Key-Value pairs for storage

No more than hbase does. Cassandra's columnfamily model does away with historical values, and adds supercolumns, but the two have a lot more in common with each other than with actual k/v stores.

> it's a fact that far more people are using HBase than Cassandra at this moment

While it's possible that more people are using HBase right now, with 90 people in the cassandra irc channel, and 55 in hbase, I'm comfortable that Cassandra's community is healthy.

> despite both being similarly recent

HBase is roughly 2x as old as Cassandra.

> HBase values strong consistency and High Availability while Cassandra values Availability and Partitioning tolerance

HBase actually picks CP.

> Efficiently running MapReduce on Cassandra, on the other hand, is difficult because all of its keys are in one big "space", so the MapReduce framework doesn't know how to split and divide the data natively. There needs to be some hackery in place to handle all of that.

Writing a hadoop input generator is a Feature, to use the article's terminology. It doesn't have to be hackish; in fact, trunk now has a key range splitter that could easily be adapted to Hadoop. Quoting an old patchset to "prove" that cassandra can only poorly interface to hadoop is weak.

> Cassandra is only a Ruby gem install away.

Or a tar download, or a deb package...

> You still have to do quite a bit of manual configuration

Other than columnfamily definition (which must also be done for hbase), I'm not sure what the author was thinking of here. bin/cassandra works out of the box, and (unlike hbase) there is only one type of process to deal with, which is a huge win for ops in production.

> in HBase, if a region server is down, writes will be blocked for affected data until the data is redistributed

(that is why hbase really has CP out of CAP, not CA)

> Cassandra, however, has an internal method of resolving up-to-dateness issues with vector clocks - a complex but workable solution where basically the latest timestamp wins

No; Cassandra uses latest-timestamp-wins, which is totally different from vector clocks.

> Another architectural quibble is that Cassandra only supports one table per install. That means you can't denormalize your data to make it more usable in analytical scenarios.

Not even a kernel of truth there. wtf?

> Cassandra is really more of a Key Value store than a Data Warehouse.

Again: wtf?

> Furthermore, schema changes require a cluster restart

This part is true, for now. But it's misleading, since "schema change" means "adding CFs or keyspaces," not merely "modifying columns" like in traditional dbs.
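
As a footnote to the vector-clock correction above, here is a minimal Python sketch of what latest-timestamp-wins reconciliation looks like. The Column type and the reconcile function are hypothetical names made up for illustration, not Cassandra's actual classes, and the tie-breaking rule is an assumption.

```python
from typing import NamedTuple


class Column(NamedTuple):
    """Hypothetical column version; not Cassandra's real data structure."""
    name: str
    value: bytes
    timestamp: int  # supplied by the writer, e.g. microseconds since epoch


def reconcile(a: Column, b: Column) -> Column:
    """Pick the version with the higher timestamp.

    Assumption: on a timestamp tie we simply keep `a`.
    """
    return b if b.timestamp > a.timestamp else a


old = Column("email", b"old@example.com", timestamp=1)
new = Column("email", b"new@example.com", timestamp=2)

# The result does not depend on the order in which replicas are compared.
winner = reconcile(old, new)
```

A vector clock, by contrast, keeps a counter per node and can report two versions as concurrent, leaving the conflict to be resolved elsewhere; here a single integer comparison always picks a winner.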

> it's difficult to claim that Cassandra implements the BigTable model

We never claimed to be a pure bigtable clone. We don't want to be, because of the single points of failure and operational complexity involved.

> Cassandra is optimized for small datacenters (hundreds of nodes) connected by very fast fiber. HBase, being based on research originally published by Google, is happy to handle replication to thousands of planet-strewn nodes across the 'slow', unpredictable Internet

Cassandra has multi-datacenter support already. HBase didn't, last I checked. So this is weird.

> This first diagram is a model of the Cassandra replication scheme.

Note that all these steps happen in parallel.

> it's impossible to tell when the required number of replicas will be up-to-date. This can be extremely painful in a live situation - when one of your DCs goes down, you often want to know *exactly* when to expect data consistency

Cassandra provides consistency when R + W > N (read replica count + write replica count > replication factor). If you do writes and reads both with QUORUM, for one example, you can expect data consistency as soon as there are enough nodes for a quorum (which may not even require the DC to be online). That is not "impossible to tell" at all.

> It's important to note that Cassandra relies on high-speed fiber between datacenters.

Simply flat-out wrong.

> If your writes are taking 1 or 2 ms, that's fine. But when a DC goes out and you have to revert to a secondary one in China instead of 20 miles away, the incredible latency will lead to write timeouts and highly inconsistent data.

Sure, "incredible" latency of 100ms or so is bad, but it's not the end of the world, and won't cause either write timeouts or inconsistent data, assuming that you are in fact using R + W > N.
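
To make the R + W > N rule concrete, here is a small Python sketch of the arithmetic. The function names are hypothetical and chosen for illustration; this is not Cassandra's API, just the inequality written out.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def is_consistent(r: int, w: int, n: int) -> bool:
    """True when every read set must overlap every write set.

    r: replicas consulted per read
    w: replicas that must acknowledge a write
    n: replication factor
    """
    return r + w > n


n = 3
q = quorum(n)  # 2 of 3 replicas

quorum_both = is_consistent(q, q, n)  # QUORUM reads and QUORUM writes
one_both = is_consistent(1, 1, n)     # a single replica for reads and writes
```

With a replication factor of 3, QUORUM on both sides gives 2 + 2 > 3, so any read overlaps the latest acknowledged write in at least one replica. Reading and writing a single replica gives 1 + 1 = 2, which does not exceed 3, so a read can miss the most recent write.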