We ran a test a few months back and found that RAID0 is better than just one mount directory. I can't recall whether the test also compared multiple data directories against RAID0.

We chose the RAID0 method for the AMI to avoid confusion and to keep all the sstables on one drive, where they are easier to find. This is also in line with many setups that we see in the wild.


btw, what is the performance difference between building a raid0 across the
multiple ephemeral drives available and assigning it as cassandra's
data directory, vs creating a mount on each of these drives and
specifying all of them in cassandra's data directory list?

since these drives are all virtual, would there be any benefit at all
in doing a raid0 ?
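
For reference, the two layouts being compared would look roughly like this in cassandra.yaml. This is only a sketch: data_file_directories is the real setting, but the mount points are made up for illustration, and nothing here says which option is faster.

```yaml
# Option A: all ephemeral drives striped into one RAID0 array,
# mounted at a single path (hypothetical mount point).
data_file_directories:
    - /raid0/cassandra/data

# Option B: one mount per ephemeral drive, every mount listed
# (hypothetical mount points). Uncomment instead of Option A.
# data_file_directories:
#     - /mnt/ephemeral0/cassandra/data
#     - /mnt/ephemeral1/cassandra/data
```

With Option A all sstables end up under one path; with Option B cassandra spreads them across the listed directories.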


> Also, EBS volumes can be attached, but their performance issues cause other
> problems when trying to run a healthy cluster. From experience, clusters running
> on EBS volumes bring their own set of unique problems and are harder to debug.
> Here's a quick link that provides a bit more background information on why
> it's not the best fit for Cassandra.
>> AFAIK it's around 450G per ephemeral disk.
>> BTW you can sometimes get high-performance EBS drives as well. Performance
>> is good for a DB, but the IOPS are unpredictable.
>> it seems that the number of virtual disks you get is fixed per instance type:
>> on m2.4xlarge you have 2 disks, while on m2.2xlarge you have only 1,
>> so I can't set up a raid0 on m2.2xlarge
>> am I correct?
