Date: Mon, 26 Apr 2010 09:53:10 -0700
Subject: Re: Can Cassandra make real use of several DataFileDirectories?
From: Ryan King
To: user@cassandra.apache.org

I would recommend using RAID-0 rather than multiple data directories.

-ryan

2010/4/26 Roland Hänel:
> I have a configuration like this:
>
>     /storage01/cassandra/data
>     /storage02/cassandra/data
>     /storage03/cassandra/data
>
> After loading a big chunk of data into Cassandra, I end up with some 70GB in
> the first directory, and only about 10GB in the second and third ones. All
> rows are quite small, so it's not just some big rows that contain the
> majority of data.
>
> Does Cassandra have the ability to 'see' the maximum available space in
> these directories? I'm asking myself this question since my limit is 100GB,
> and the first directory is approaching this limit...
>
> And, wouldn't it be better if Cassandra tried to 'load-balance' the files
> inside the directories, because this would result in better (read) performance
> if the directories are on different disks (which is the case for me)?
>
> Any help is appreciated.
>
> Roland
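
To check how unevenly the data directories from the question are filling up, a small Python sketch along these lines may help. It assumes the three paths quoted above. The helper names `directory_usage` and `report` are made up for illustration, and the script only reads the filesystem; it does not use any Cassandra API.

```python
import os
import shutil


def directory_usage(path):
    """Return (bytes used by regular files under path, bytes free on its filesystem)."""
    used = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so files are not counted twice.
            if os.path.isfile(full) and not os.path.islink(full):
                used += os.path.getsize(full)
    free = shutil.disk_usage(path).free
    return used, free


def report(paths):
    """Build one human-readable line per directory."""
    lines = []
    for p in paths:
        used, free = directory_usage(p)
        lines.append(f"{p}: {used / 1e9:.1f} GB used, {free / 1e9:.1f} GB free")
    return lines


if __name__ == "__main__":
    # Assumed paths, copied from the configuration quoted in the question.
    data_dirs = [
        "/storage01/cassandra/data",
        "/storage02/cassandra/data",
        "/storage03/cassandra/data",
    ]
    for line in report([d for d in data_dirs if os.path.isdir(d)]):
        print(line)
```

Running it periodically (for example from cron) would show whether one directory is approaching its limit well before the others.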