cassandra-user mailing list archives

From Tyler Hobbs <ty...@riptano.com>
Subject Re: SSD vs. HDD
Date Wed, 03 Nov 2010 20:58:57 GMT
SSDs will not generally improve your write performance very much, but they
can significantly improve read performance.

You do *not* want to waste an SSD on the commitlog drive, as even a slow HDD
can write sequentially very quickly.  For the data drive, they might make
sense.

As Jonathan talks about, it has a lot to do with your access patterns.  If
you frequently (1) delete parts of rows, (2) update parts of rows, or
(3) insert new columns into existing rows, you'll end up with rows spread
across several SSTables (which are on disk).  This means that each read may
require several seeks, which are very slow for HDDs, but are very quick for
SSDs.
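
To put rough numbers on the seek cost, here is a back-of-the-envelope
sketch.  The latency figures and the one-seek-per-SSTable model are my own
illustrative assumptions, not measurements, and read_latency_ms is a
made-up helper rather than anything in Cassandra:

```python
# Rough per-read latency model. All latency figures are illustrative
# assumptions, not measurements from any particular drive.
HDD_SEEK_MS = 10.0   # assumed average random seek on a spinning disk
SSD_SEEK_MS = 0.1    # assumed random access time on an SSD


def read_latency_ms(sstables_touched, seek_ms):
    """Estimate one uncached read, assuming one seek per SSTable touched."""
    return sstables_touched * seek_ms


for sstables in (1, 4):
    hdd = read_latency_ms(sstables, HDD_SEEK_MS)
    ssd = read_latency_ms(sstables, SSD_SEEK_MS)
    print(f"{sstables} SSTable(s): HDD ~{hdd:.1f} ms, SSD ~{ssd:.1f} ms")
```

The gap grows with the number of SSTables a row is spread across, so the
more you update rows in place, the more an SSD data drive is likely to help.
Real numbers depend on your hardware and on how many reads are served from
cache.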

Of course, the randomness of what rows you access is also important, but
Jonathan did a good job of covering that.  Don't forget about the effects of
caching here, too.

The only way to tell if it is cost-effective is to test your particular
access patterns (using a configured stress.py test or, preferably, your
actual application).

- Tyler

On Wed, Nov 3, 2010 at 3:44 PM, Jonathan Shook <jshook@gmail.com> wrote:

> SSDs are not reliable after a (relatively-low compared to spinning
> disk) number of writes.
> They may significantly boost performance if used on the "journal"
> storage, but will suffer short lifetimes for highly-random write
> patterns.
>
> In general, plan to replace them frequently. Whether they are worth
> it, given the performance improvement over the cost of replacement x
> hardware x logistics is generally a calculus problem. It's difficult
> to make a generic rationale for or against them.
>
> You might be better off in general by throwing more memory at your
> servers, and isolating your random access from your journaled data.
> Is there any pattern to your reads and writes/deletes? If it is fully
> random across your keys, then you have the worst-case scenario.
> Sometimes you can impose access patterns or structural patterns in
> your app which make caching more effective.
>
> Good questions to ask about your data access:
> Is there a "user session" which shows an access pattern to proximal data?
> Are there sets of access which always happen close together?
> Are there keys or maps which add extra indirection?
>
> I'm not familiar with your situation. I was just providing some general
> ideas..
>
> Jonathan Shook
>
> On Wed, Nov 3, 2010 at 2:32 PM, Alaa Zubaidi <alaa.zubaidi@pdf.com> wrote:
> > Hi,
> > we have continuous high-throughput writes, reads, and deletes, and we are
> > trying to find the best hardware.
> > Does using SSDs for Cassandra improve performance? Has anyone compared SSD
> > vs. HDD? Any recommendations on SSDs?
> >
> > Thanks,
> > Alaa
> >
> >
>
