SSDs will not generally improve your write performance very much, but they can significantly improve read performance.

You do *not* want to waste an SSD on the commitlog drive: the commitlog is only ever appended to, and even a slow HDD can write sequentially very quickly.  For the data drive, an SSD might make sense.

As Jonathan says, it has a lot to do with your access patterns.  If you frequently (1) delete parts of rows, (2) update parts of rows, or (3) insert new columns into existing rows, you'll end up with rows spread across several SSTables (which are on disk).  This means that each read may require several seeks, which are very slow on HDDs but very quick on SSDs.
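
To put rough numbers on that, here is a back-of-the-envelope sketch in Python.  The latency figures (about 10 ms per HDD seek, about 0.1 ms per SSD random read) are my own ballpark assumptions, not measurements, and estimate_read_ms is a made-up helper name.  It assumes one seek per SSTable holding a piece of the row and ignores caching entirely.

```python
# Ballpark per-seek latencies in milliseconds. These are assumptions for
# illustration only, not benchmarks of any particular drive.
HDD_SEEK_MS = 10.0
SSD_SEEK_MS = 0.1


def estimate_read_ms(sstables_touched, seek_ms):
    """Estimate the disk time for one uncached row read.

    Assumes one seek per SSTable that holds a fragment of the row.
    """
    if sstables_touched < 0:
        raise ValueError("sstables_touched must be non-negative")
    return sstables_touched * seek_ms


if __name__ == "__main__":
    # Compare how the cost grows as a row fragments across more SSTables.
    for n in (1, 2, 4, 8):
        hdd = estimate_read_ms(n, HDD_SEEK_MS)
        ssd = estimate_read_ms(n, SSD_SEEK_MS)
        print(f"{n} SSTables: HDD ~{hdd:.1f} ms, SSD ~{ssd:.1f} ms")
```

Under those assumptions the gap between the two drive types grows with every extra SSTable a read has to touch, which is why fragmented rows are where an SSD would pay off most.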

Of course, how randomly you access rows is also important, but Jonathan did a good job of covering that.  Don't forget about the effects of caching here, too.

The only way to tell if it is cost-effective is to test your particular access patterns (using a configured stress.py test or, preferably, your actual application).
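
If you test with your own application, a small timing harness along these lines may be enough to compare an HDD-backed node with an SSD-backed one.  This is only a sketch: read_fn is a placeholder for whatever call your client uses to fetch a row (I'm not assuming any particular client API here), and measure_latencies and percentile are names I made up.

```python
import math
import time


def measure_latencies(read_fn, keys):
    """Call read_fn(key) for each key and return the sorted latencies in ms."""
    latencies = []
    for key in keys:
        start = time.perf_counter()
        read_fn(key)
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    return latencies


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


if __name__ == "__main__":
    # Stand-in for a real read: a dict lookup. Replace with your client call.
    fake_store = {i: "row-%d" % i for i in range(1000)}
    samples = measure_latencies(fake_store.get, range(1000))
    print("median ms:", percentile(samples, 50))
    print("p99 ms:", percentile(samples, 99))
```

Feed it keys in the same order and mix your application really uses, and look at the high percentiles rather than the average, since the reads that miss the cache are the ones the drive type affects.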

- Tyler

On Wed, Nov 3, 2010 at 3:44 PM, Jonathan Shook <jshook@gmail.com> wrote:
SSDs are not reliable after a number of writes that is relatively low
compared to spinning disks.
They may significantly boost performance if used on the "journal"
storage, but will suffer short lifetimes for highly-random write
patterns.

In general, plan to replace them frequently. Whether they are worth
it, given the performance improvement versus the cost of replacement x
hardware x logistics, is generally a calculus problem. It's difficult
to make a generic case for or against them.

You might be better off in general by throwing more memory at your
servers, and isolating your random access from your journaled data.
Is there any pattern to your reads and writes/deletes? If it is fully
random across your keys, then you have the worst-case scenario.
Sometimes you can impose access patterns or structural patterns in
your app which make caching more effective.

Good questions to ask about your data access:
Is there a "user session" which shows an access pattern to proximal data?
Are there sets of access which always happen close together?
Are there keys or maps which add extra indirection?

I'm not familiar with your situation. I was just providing some general ideas.

Jonathan Shook

On Wed, Nov 3, 2010 at 2:32 PM, Alaa Zubaidi <alaa.zubaidi@pdf.com> wrote:
> Hi,
> We have continuous high-throughput writes, reads, and deletes, and we are
> trying to find the best hardware.
> Does using SSDs for Cassandra improve performance? Has anyone compared SSDs
> vs. HDDs? Any recommendations on SSDs?
>
> Thanks,
> Alaa
>
>