References: <23b1e84e0909291222n66d791agd500a22a8b93d437@mail.gmail.com>
From: Igor Katkov
Date: Wed, 30 Sep 2009 11:31:03 -0400
Message-ID: <23b1e84e0909300831v498154f9x11283c9d87fd5bfe@mail.gmail.com>
Subject: Re: data distribution among DataFileDirectories
To: cassandra-user@incubator.apache.org

OK, I figured it out.
Compaction always writes to the hard drive that has the most free space;
see DatabaseDescriptor.getDataFileLocationForTable(String table, long expectedCompactedFileSize).

That is how all the data might end up in one folder only.
In the long run, provided that one has enough data, the biggest hard drive will always be stressed the most, which affects read performance.
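
To make the difference concrete, here is a minimal sketch of the two placement policies discussed in this thread: picking the "next" directory in turn, versus picking the directory with the most free space for a compacted file. This is not the Cassandra source. The class DirectoryChooser, its method names, and the in-memory free-space map are all hypothetical and exist only for illustration; a real implementation would presumably query the file system instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from DatabaseDescriptor.
public class DirectoryChooser {
    private final List<String> dirs;
    private final Map<String, Long> freeBytes; // simulated free space per directory
    private int next = 0;

    public DirectoryChooser(Map<String, Long> freeBytesByDir) {
        this.freeBytes = new LinkedHashMap<>(freeBytesByDir);
        this.dirs = new ArrayList<>(freeBytesByDir.keySet());
    }

    // Round-robin policy: each call returns the "next" directory in turn.
    public String nextDirectory() {
        String dir = dirs.get(next);
        next = (next + 1) % dirs.size();
        return dir;
    }

    // Most-free-space policy: returns the directory with the most free space,
    // or null if even that directory cannot hold a file of the expected size.
    public String directoryWithMostFreeSpace(long expectedSize) {
        String best = null;
        long bestFree = -1;
        for (String dir : dirs) {
            long free = freeBytes.get(dir);
            if (free > bestFree) {
                bestFree = free;
                best = dir;
            }
        }
        if (best == null || bestFree < expectedSize) {
            return null;
        }
        // Account for the file just placed so later calls see less free space.
        freeBytes.put(best, bestFree - expectedSize);
        return best;
    }
}
```

With two directories of very different capacity, the second policy keeps returning the larger one until its free space drops below the smaller one's, which would explain how compacted data piles up in a single folder.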


On Tue, Sep 29, 2009 at 3:46 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> On Tue, Sep 29, 2009 at 2:22 PM, Igor Katkov <ikatkov@gmail.com> wrote:
> > Does cassandra distributes keys evenly among DataFileDirectories?
>
> No, but it should distribute sstables evenly (which, on average,
> should be distributing keys evenly, but there will be large variance).
>
> > Questions:
> > 1. Is it by design?
>
> Each time a sstable is created, either by flush or compaction, it
> should pick the "next" directory to use.
>
> > 2. Is there a way to control key distribution, for the cases when
> > hard-drives are of different capacity?
>
> No. (That wouldn't be hard to add, but nobody's needed it.)
>
> -Jonathan
