cassandra-user mailing list archives

From Benjamin Black...@b3k.us>
Subject Re: TechCrunch article on Twitter and Cassandra
Date Sat, 10 Jul 2010 22:21:48 GMT
On Sat, Jul 10, 2010 at 12:22 PM, Colin Clark
<colin@cloudeventprocessing.com> wrote:
>
> Although I'm a fan of Cassandra, there's no way I'd use it today for my tier
> 1 deployments, because I don't have the resources of Facebook, and even
> though Cassandra is open source, that doesn't mean I can fix it when it goes
> down.  And, because it's open source, there's no one to call to have it
> fixed reliably and within production constraints.  Cassandra's strength is
> its greatest weakness right now.
>

There are others, however, who do have the skills not just to fix it
when it goes down, but to improve the code in a variety of ways and
contribute that code back to the project.  That you do not have those
skills is a good indication you should stick to what you know, not an
indictment of Cassandra (or any other non-SQL store).

> The bloom is starting to come off NoSQL, which is normal - it means that
> people & firms are trying to do more with it and most probably realizing
> that all of the tools, support, infrastructure, etc. surrounding alternative
> solutions isn't such a bad thing.  And that the world of NoSQL had start to
> come up with a better mantra than "joins are bad, dude", and "you're just
> protecting the status quo."  There's a *lot more* big data wrapped up inside
> of SQL databases and only a fraction of that in NoSQL - and there's a lot of
> reasons for it.
>

You are, for whatever reason, using the dullest of cliches as if they
were informed opinion.  Nobody with actual knowledge of the space says
"joins are bad, dude".  What they might say is "When you have
petabytes and low latency requirements, joins are an expensive
proposition".  That is clearly a true statement and constructing
indices in a column store to avoid joins is a reasonable decision to
avoid that expense.  Is it free?  Of course not, nothing is.
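
To make the index-instead-of-join point concrete, here is a minimal
sketch in plain Python.  It is not Cassandra's API: the names (users,
posts_by_user, add_user, add_post, posts_for) are hypothetical and
ordinary dicts stand in for column families.  It only illustrates the
trade: extra work on every write buys a single-row read with no join.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two column families.
users = {}                          # row key: user_id -> {column: value}
posts_by_user = defaultdict(dict)   # row key: user_id -> {post_id: text}


def add_user(user_id, name):
    users[user_id] = {"name": name}


def add_post(user_id, post_id, text):
    # Write-time cost: the post is stored under the author's row, so the
    # "index" is maintained on every write instead of computed on read.
    posts_by_user[user_id][post_id] = text


def posts_for(user_id):
    # Read path: one row lookup, no join across users and posts.
    row = posts_by_user.get(user_id, {})
    return [row[post_id] for post_id in sorted(row)]
```

The cost shows up as additional writes and storage, which is the "not
free" part.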

> For example, do I *really* need Cassandra if MySQL will work for me and I
> just want to get up and running quickly without writing a bunch of code?  My
> team was pushing greater than 20k updates per second into, GASP, Oracle 5
> years ago.  Sure, it was expensive.  But it worked.  And it was worth it -
> or we wouldn't have spent the $$.  What's your data worth if you don't have
> your data? zero.
>

Had you spent any time on the irc channel you would've seen this
advice given repeatedly.  If you don't need what Cassandra does, don't
use it.  That you have seen 20k updates/sec on really expensive
hardware with a SQL store is neither surprising nor relevant.  As you
must realize, though choose to ignore, Cassandra is about more than
just high, per-node write throughput.  It is about seamless scale-out
of a single cluster, robustness in the face of node failure and
network partition, etc.  Can you do that with a SQL store?  Certainly.
Expect to pay 5x in hardware and not be able to operate multi-DC.
It's what folks call a trade-off.

> And then there's support - internal support.  Picking a database du-jour is
> organizationally expensive.  Especially when there's probably one or two
> databases that Twitter could have bought off the shelf that would have
> solved their problems.

You have no idea what their actual problems are and are merely
engaging in the favorite game of HN and similar venues: armchair
engineering.


b
