cassandra-user mailing list archives

From Zhong Li <...@voxeo.com>
Subject Re: Cassandra performance
Date Fri, 17 Sep 2010 21:35:28 GMT
This is my personal experience. MySQL is faster than Cassandra in
most normal use cases.

You should understand why you are choosing Cassandra instead of MySQL. If
one central MySQL server can handle your workload, MySQL is better than
Cassandra. BUT if you are overloading one MySQL server and want multiple
boxes, Cassandra can be a cheap solution. Cassandra provides a
fault-tolerant, decentralized, durable system with a rich data model. It
will not give you high performance; read performance in particular is poor.

Digg failed in its attempt to use Cassandra. You can check
http://techcrunch.com/2010/09/07/digg-struggles-vp-engineering-door/

This doesn't mean Cassandra is bad. You need to design carefully to use
Cassandra successfully for your application and business model.



On Sep 15, 2010, at 12:06 PM, Wayne wrote:

> If MySQL is faster then use it. I struggled to do side-by-side
> comparisons with MySQL for months until finally realizing they are
> too different to compare side by side. MySQL is always faster
> out of the gate when you come at the problem thinking in terms of
> relational databases. The picture changes once you add in replication
> factor, wider rows, databases that are 2-3 terabytes, tables with 3+
> billion rows, etc. The NoSQL "noise" out there should be
> ignored, and a solution like Cassandra should be evaluated for what
> it brings to the table as a technology that can solve the
> problems of big data, not for how it does individual queries relative
> to MySQL. If a "normal" database works for you, use it!!
>
> We have tested real loads using a 6 node cluster and consistently
> get 5ms reads under load. That is 200 reads/second (1 thread). MySQL
> is 10x faster per read, but we also have wide rows, and in that 5ms we
> get 6 months of many different time series, which in the end means
> it is 10x faster than MySQL (1 thread). By embracing wide rows we
> turn slower into faster. Add in multiple threads/processes and the
> ability of a 20 node cluster to support concurrent reads, and MySQL
> is left in the dust. Also, we don't have 300GB compressed backup
> files, we can easily add new nodes and grow, we can actually add
> columns dynamically without the dreaded DDL deadlock nightmare in
> MySQL, and for once we have replication that just works.
>
>
> On Wed, Sep 15, 2010 at 2:39 AM, Oleg Anastasyev  
> <oleganas@gmail.com> wrote:
> Kamil Gorlo <kgs4242 <at> gmail.com> writes:
>
> >
> > So I've got more reads from single MySQL with 400GB of data than from
> > 8 machines storing about 266GB. This doesn't look good. What am I
> > doing wrong? :)
>
> The worst case for Cassandra is random reads. You should ask yourself
> a question: do you really have this kind of workload in production? If
> you really do, that means Cassandra is not the right tool for the job.
> Some product based on Berkeley DB should work better, e.g. Voldemort.
> A plain old filesystem is also good for 100% random reads (if you don't
> need backups, of course).
>
>
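
For anyone who wants to check the arithmetic in Wayne's message quoted above, here is a rough Python sketch. The 5ms latency and the "10x" figures come from the thread; the function names and the number of data points per wide row (100) are hypothetical, chosen only so that the numbers match the "10x faster in the end" claim. It assumes one synchronous reader per thread and that "10x faster" means 10x lower latency per read.

```python
def reads_per_second(latency_ms: float, threads: int = 1) -> float:
    """Reads/second when each thread issues one blocking read at a time."""
    return threads * 1000.0 / latency_ms


def points_per_second(latency_ms: float, points_per_read: int, threads: int = 1) -> float:
    """Useful data points fetched per second, given how much one read returns."""
    return reads_per_second(latency_ms, threads) * points_per_read


# Figures from the thread.
cassandra_latency_ms = 5.0                      # "5ms reads under load"
mysql_latency_ms = cassandra_latency_ms / 10.0  # "MySQL is 10x faster" (assumed: per read)

# Hypothetical payload sizes, NOT from the thread.
points_per_wide_row = 100   # one Cassandra wide-row read returns many data points
points_per_mysql_row = 1    # one MySQL read returns a single data point

cassandra_reads = reads_per_second(cassandra_latency_ms)
mysql_reads = reads_per_second(mysql_latency_ms)

cassandra_points = points_per_second(cassandra_latency_ms, points_per_wide_row)
mysql_points = points_per_second(mysql_latency_ms, points_per_mysql_row)

print(f"Cassandra: {cassandra_reads:.0f} reads/s, {cassandra_points:.0f} points/s")
print(f"MySQL:     {mysql_reads:.0f} reads/s, {mysql_points:.0f} points/s")
print(f"Cassandra/MySQL useful throughput: {cassandra_points / mysql_points:.1f}x")
```

Under these assumptions, a wide row has to replace on the order of 100 single-row reads before the slower per-read latency turns into a 10x advantage; with fewer points per row the comparison tilts back toward MySQL.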

