incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Effective allocation of multiple disks
Date Wed, 10 Mar 2010 17:09:54 GMT
Thanks for testing that. I added a note on stripe size to
http://wiki.apache.org/cassandra/CassandraHardware.

On Wed, Mar 10, 2010 at 11:03 AM, B. Todd Burruss <bburruss@real.com> wrote:
> with the file sizes we're talking about with cassandra and other database
> products, the stripe size doesn't seem to matter.  i suppose there may be a
> modicum of overhead with a small stripe size, but i'm not sure.  mine is set
> to 128k, which produced the same results as 16k and 256k.
>
> i will say the number of drives within the RAID 0 setup does seem to matter.
> the more you have, the more parallelism you can get with a good RAID controller.
>
> Eric Rosenberry wrote:
>>
>> Based on the documentation, it is clear that with Cassandra you want to
>> have one disk for commitlog, and one disk for data.
>>
>> My question is: If you think your workload is going to require more io
>> performance to the data disks than a single disk can handle, how would you
>> recommend effectively utilizing additional disks?
>>
>> It would seem a number of vendors sell 1U boxes with four 3.5 inch disks.
>>  If we use one for commitlog, is there a way to have Cassandra itself
>> equally split data across the three remaining disks?  Or is this something
>> that needs to be handled by the hardware level, or operating system/file
>> system level?
>>
>> Options include a hardware RAID controller in a RAID 0 stripe (this is
>> more $$$ and for what gain?), or utilizing a volume manager like LVM.
>>
>> Along those same lines, if you do implement some type of striping, what
>> RAID stripe size is recommended?  (I think Todd Burruss asked this earlier
>> but I did not see a response)
>>
>> Thanks for any input!
>>
>> -Eric
>
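
As a rough illustration of the two observations above (stripe size making little difference for large files, drive count mattering more), here is a minimal sketch of how RAID 0 might map logical offsets onto disks. It assumes a simplified round-robin layout and ignores controller caching, alignment, and seek behaviour. The function names and the 96 MiB read size are hypothetical, chosen only for the example, and do not come from Cassandra or from any particular RAID implementation.

```python
from collections import Counter

KIB = 1024


def locate(offset, stripe_size, num_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Simplified RAID 0 model: stripe units are dealt out round-robin.
    """
    stripe_index = offset // stripe_size
    disk = stripe_index % num_disks
    row = stripe_index // num_disks  # how many full rows precede this unit
    return disk, row * stripe_size + offset % stripe_size


def bytes_per_disk(start, length, stripe_size, num_disks):
    """Count how many bytes of one sequential read land on each disk."""
    counts = Counter()
    pos, end = start, start + length
    while pos < end:
        disk, _ = locate(pos, stripe_size, num_disks)
        # Consume up to the end of the current stripe unit.
        chunk = min(stripe_size - pos % stripe_size, end - pos)
        counts[disk] += chunk
        pos += chunk
    return [counts[d] for d in range(num_disks)]


# Hypothetical 96 MiB sequential read over three data disks,
# using the three stripe sizes mentioned in the thread.
read_size = 96 * KIB * KIB
for stripe in (16 * KIB, 128 * KIB, 256 * KIB):
    print(stripe // KIB, "KiB stripe:", bytes_per_disk(0, read_size, stripe, 3))
```

In this model a read much larger than the stripe unit is spread evenly over all three disks whichever of the three stripe sizes is used, and adding disks shrinks each disk's share of the read. The model says nothing about actual throughput, which depends on the controller and the drives.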
