From: Benjamin Black
To: user@cassandra.apache.org, colin@cloudeventprocessing.com
Date: Sat, 10 Jul 2010 15:21:48 -0700
Subject: Re: TechCrunch article on Twitter and Cassandra
In-Reply-To: <4C38C86B.5070004@cloudeventprocessing.com>

On Sat, Jul 10, 2010 at 12:22 PM, Colin Clark wrote:
>
> Although I'm a fan of Cassandra, there's no way I'd use it today for my tier
> 1 deployments, because I don't have the resources of Facebook, and even
> though Cassandra is open source, that doesn't mean I can fix it when it goes
> down. And, because it's open source, there's no one to call to have it
> fixed reliably and within production constraints. Cassandra's strength is
> its greatest weakness right now.
>

There are others, however, who do have the skills not just to fix it
when it goes down, but to improve the code in a variety of ways and
contribute that code back to the project. That you do not have those
skills is a good indication you should stick to what you know, not an
indictment of Cassandra (or any other non-SQL store).

> The bloom is starting to come off NoSQL, which is normal - it means that
> people & firms are trying to do more with it and most probably realizing
> that all of the tools, support, infrastructure, etc. surrounding alternative
> solutions isn't such a bad thing. And that the world of NoSQL had start to
> come up with a better mantra than "joins are bad, dude", and "you're just
> protecting the status quo." There's a *lot more* big data wrapped up inside
> of SQL databases and only a fraction of the in NoSQL - and there's a lot of
> reasons for it.
>

You are, for whatever reason, using the dullest of cliches as if they
were informed opinion. Nobody with actual knowledge of the space says
"joins are bad, dude". What they might say is "When you have
petabytes and low latency requirements, joins are an expensive
proposition". That is clearly a true statement, and constructing
indices in a column store to avoid joins is a reasonable way to
avoid that expense. Is it free? Of course not, nothing is.

> For example, do I *really* need Cassandra if MySQL will work for me and I
> just want to get up and running quickly without writing a bunch of code? My
> team was pushing greater than 20k updates per second into, GASP, Oracle 5
> years ago. Sure, it was expensive. But it worked. And it was worth it -
> or we wouldn't have spent the $$. What's your data worth if you don't have
> your data? zero.
>

Had you spent any time on the IRC channel you would've seen this
advice given repeatedly. If you don't need what Cassandra does, don't
use it. That you have seen 20k updates/sec on really expensive
hardware with a SQL store is neither surprising nor relevant. As you
must realize, though you choose to ignore it, Cassandra is about more
than just high per-node write throughput. It is about seamless
scale-out of a single cluster, robustness in the face of node failure
and network partition, etc. Can you do that with a SQL store?
Certainly. Expect to pay 5x in hardware and not be able to operate
multi-DC. It's what folks call a trade-off.

> And then there's support - internal support. Picking a database du-jour is
> organizationally expensive. Especially when there's probably one or two
> databases that Twitter could have bought off the shelf that would have
> solved their problems.

You have no idea what their actual problems are and are merely
engaging in the favorite game of HN and similar venues: armchair
engineering.


b