incubator-cassandra-user mailing list archives

From Eric Rosenberry <epros...@gmail.com>
Subject Re: Effective allocation of multiple disks
Date Wed, 10 Mar 2010 08:19:37 GMT
Ahh, thanks!  I had read that, but I had assumed the reference to "use one
or more devices for DataFileDirectories" was referring to somehow making
multiple physical devices into one logical device via some underlying RAID
system.

So then, as far as free space on the disks goes, I have seen references to
keeping utilization below 50% to handle compaction.  Would it not be true to
say that you only need as much free space as it takes to hold another copy of
the largest data file you have?  (i.e. perhaps less than 50% of the disk)
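
The free-space question above comes down to simple arithmetic. The following
is a minimal Python sketch of that arithmetic, not Cassandra's actual
compaction logic. It assumes that a compaction writes its merged output in
full before deleting the input files, and that the output is no larger than
the sum of its inputs. The file sizes, the disk size, and all function names
are hypothetical. Whether a compaction rewrites only the largest file or
merges several files at once is not stated in this thread, so both cases are
shown.

```python
def headroom_largest_file(sstable_sizes):
    """Free space needed if a compaction only ever rewrites the largest file."""
    return max(sstable_sizes, default=0)


def headroom_merge(sstable_sizes, merged):
    """Upper bound on free space needed to merge the files at the given indexes.

    Assumption: the merged output is written in full before the inputs are
    deleted, and it is no larger than the sum of its inputs.
    """
    return sum(sstable_sizes[i] for i in merged)


def max_safe_utilization(disk_size, headroom):
    """Fraction of the disk that can hold data while leaving `headroom` free."""
    return max(0.0, (disk_size - headroom) / disk_size)


# Hypothetical data file sizes in GB on a hypothetical 500 GB disk.
sizes = [120, 60, 30, 15]
disk = 500

largest = headroom_largest_file(sizes)                  # 120 GB
everything = headroom_merge(sizes, range(len(sizes)))   # 225 GB

print(max_safe_utilization(disk, largest))      # 0.76
print(max_safe_utilization(disk, everything))   # 0.55
```

Under these assumptions, the "largest file" estimate permits about 76%
utilization, while a merge of every file on the disk permits only about 55%.
The 50% guideline is close to the second case, where all data may be
rewritten at once.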

Due to the compaction space requirement, would it be more efficient to do
RAID 0 somewhere under the hood?

Simply being able to specify multiple DataFileDirectories does indeed sound
appealing...

Thanks.

-Eric

On Wed, Mar 10, 2010 at 12:08 AM, Stu Hood <stu.hood@rackspace.com> wrote:

> You can list multiple DataFileDirectories, and Cassandra will scatter files
> across all of them. Use 1 disk for the commitlog, and 3 disks for data
> directories.
>
> See http://wiki.apache.org/cassandra/CassandraHardware#Disk
>
> Thanks,
> Stu
>
> -----Original Message-----
> From: "Eric Rosenberry" <eprosenx@gmail.com>
> Sent: Wednesday, March 10, 2010 2:00am
> To: cassandra-user@incubator.apache.org
> Subject: Effective allocation of multiple disks
>
> Based on the documentation, it is clear that with Cassandra you want to
> have one disk for commitlog, and one disk for data.
>
> My question is: If you think your workload is going to require more io
> performance to the data disks than a single disk can handle, how would you
> recommend effectively utilizing additional disks?
>
> It would seem a number of vendors sell 1U boxes with four 3.5 inch disks.
>  If we use one for commitlog, is there a way to have Cassandra itself
> equally split data across the three remaining disks?  Or is this something
> that needs to be handled by the hardware level, or operating system/file
> system level?
>
> Options include a hardware RAID controller in a RAID 0 stripe (this is more
> $$$ and for what gain?), or utilizing a volume manager like LVM.
>
> Along those same lines, if you do implement some type of striping, what
> RAID stripe size is recommended?  (I think Todd Burruss asked this earlier
> but I did not see a response)
>
> Thanks for any input!
>
> -Eric
>
>
>
