cassandra-user mailing list archives

From Roland Hänel <rol...@haenel.me>
Subject Re: Can Cassandra make real use of several DataFileDirectories?
Date Mon, 26 Apr 2010 19:01:48 GMT
Hm... I understand that RAID0 would help to create a bigger pool for
compactions. However, it might hurt read performance: if I have several
CFs (each with its own SSTables), random reads for CF files that sit on
separate disks will behave nicely. With RAID0, however, a random read on
any file will create a random read on all of the hard disks.
Correct?

-Roland
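
A rough sketch of the striping arithmetic behind this question, assuming plain RAID-0 where chunk i of the array lives on disk i mod N. The function name, the 64 KiB chunk size and the three-disk layout are illustrative assumptions, not anything Cassandra or a particular RAID controller mandates. Under that assumption a small random read stays on one disk, and only a read that crosses chunk boundaries touches several:

```python
def disks_touched(offset, length, chunk_size, num_disks):
    """Return the set of disk indices hit by a read of `length` bytes
    starting at byte `offset`, assuming simple RAID-0 striping where
    chunk i is stored on disk (i % num_disks)."""
    if length <= 0:
        return set()
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    # Once a read spans num_disks consecutive chunks, every disk is hit,
    # so there is no need to walk further than that.
    last_chunk = min(last_chunk, first_chunk + num_disks - 1)
    return {chunk % num_disks for chunk in range(first_chunk, last_chunk + 1)}


CHUNK = 64 * 1024  # assumed chunk size, for illustration only

small_read = disks_touched(offset=CHUNK, length=4096, chunk_size=CHUNK, num_disks=3)
large_read = disks_touched(offset=0, length=3 * CHUNK, chunk_size=CHUNK, num_disks=3)
```

Whether small reads really stay on a single disk in practice depends on the actual chunk size and on the read sizes Cassandra issues, which this sketch does not model.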

2010/4/26 Jonathan Ellis <jbellis@gmail.com>

> http://wiki.apache.org/cassandra/CassandraHardware
>
> On Mon, Apr 26, 2010 at 1:06 PM, Edmond Lau <edmond@ooyala.com> wrote:
> > Ryan -
> >
> > You (or maybe someone else) mentioned using RAID-0 instead of multiple
> > data directories at the Cassandra hackathon as well.  Could you
> > explain the motivation behind that?
> >
> > Thanks,
> > Edmond
> >
> > On Mon, Apr 26, 2010 at 9:53 AM, Ryan King <ryan@twitter.com> wrote:
> >> I would recommend using RAID-0 rather than multiple data directories.
> >>
> >> -ryan
> >>
> >> 2010/4/26 Roland Hänel <roland@haenel.me>:
> >>> I have a configuration like this:
> >>>
> >>>   <DataFileDirectories>
> >>>       <DataFileDirectory>/storage01/cassandra/data</DataFileDirectory>
> >>>       <DataFileDirectory>/storage02/cassandra/data</DataFileDirectory>
> >>>       <DataFileDirectory>/storage03/cassandra/data</DataFileDirectory>
> >>>   </DataFileDirectories>
> >>>
> >>> After loading a big chunk of data into Cassandra, I end up with some 70GB in
> >>> the first directory, and only about 10GB in the second and third one. All
> >>> rows are quite small, so it's not just some big rows that contain the
> >>> majority of data.
> >>>
> >>> Does Cassandra have the ability to 'see' the maximum available space in
> >>> these directories? I'm asking myself this question since my limit is 100GB,
> >>> and the first directory is approaching this limit...
> >>>
> >>> And, wouldn't it be better if Cassandra tried to 'load-balance' the files
> >>> inside the directories because this will result in better (read) performance
> >>> if the directories are on different disks (which is the case for me)?
> >>>
> >>> Any help is appreciated.
> >>>
> >>> Roland
> >>>
> >>>
> >>
> >
>
