Date: Thu, 9 Dec 2010 12:13:28 -0800 (PST)
From: Scott Dworkis <svd@mylife.com>
To: Rustam Aliyev
Cc: user@cassandra.apache.org
Subject: Re: Cassandra and disk space
In-Reply-To: <4D011DDA.30404@code.az>

I recently finished a practice expansion from 4 nodes to 5: a series of "nodetool move", "nodetool cleanup", and JMX GC steps. I found that during some of the steps, disk usage on one of the nodes actually grew to 2.5x the base data size. I'm using 0.6.4.

-scott
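(For archive readers: a rough sketch of the kind of rebalance sequence Scott describes, using 0.6-era nodetool. The hostnames, JMX port, and tokens below are illustrative placeholders, not his actual values.)

  # Sketch of a 4 -> 5 node rebalance on 0.6. With RandomPartitioner,
  # the ideal token for node i of N is i * 2**127 / N, so for 5 nodes:
  nodetool -host node1 -port 8080 move 0
  nodetool -host node2 -port 8080 move 34028236692093846346337460743176821145
  nodetool -host node3 -port 8080 move 68056473384187692692674921486353642291
  nodetool -host node4 -port 8080 move 102084710076281539039012382229530463436
  nodetool -host node5 -port 8080 move 136112946768375385385349842972707284582

  # After the moves, drop the ranges each node no longer owns.
  nodetool -host node1 -port 8080 cleanup    # repeat for each node

  # On 0.6, obsolete SSTables are only unlinked after the JVM
  # garbage-collects their references, so invoking a full GC over JMX
  # (the gc() operation on java.lang:type=Memory, e.g. from jconsole)
  # reclaims the disk space sooner.

Moving one node at a time keeps only a single range transfer in flight, which limits how much temporary disk any one node needs during the rebalance.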
On Thu, 9 Dec 2010, Rustam Aliyev wrote:

> Are there any plans to improve this in the future?
>
> For big data clusters this could be very expensive. Based on your comment, I will need 200TB of storage for 100TB of data to keep Cassandra running.
>
> --
> Rustam.
>
> On 09/12/2010 17:56, Tyler Hobbs wrote:
>> If you are on 0.6, repair is particularly dangerous with respect to disk space usage. If your replica is sufficiently out of sync, you can easily triple your disk usage. This has been improved in 0.7, so repairs should use about half as much disk space, on average.
>>
>> In general, yes, keep your nodes under 50% disk usage at all times. Any of compaction, cleanup, snapshotting, repair, or bootstrapping (the latter two are improved in 0.7) can double your disk usage temporarily.
>>
>> You should plan to add more disk space or add nodes when you get close to this limit. Once you go over 50%, it's more difficult to add nodes, at least in 0.6.
>>
>> - Tyler
>>
>> On Thu, Dec 9, 2010 at 11:19 AM, Mark wrote:
>>> I recently ran into a problem during a repair operation where my nodes completely ran out of space and my whole cluster was... well, clusterfucked.
>>>
>>> I want to make sure I know how to prevent this problem in the future.
>>>
>>> Should I make sure that every node stays under 50% disk usage at all times? Are there any normal day-to-day operations that would cause any one node to double in size that I should be aware of? If one or more nodes surpass the 50% mark, what should I plan to do?
>>>
>>> Thanks for any advice
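(A minimal sketch of how the 50% headroom rule above could be watched automatically. The data directory path and threshold are assumptions; adjust them to match your own storage-conf.xml and comfort level.)

  #!/bin/sh
  # Warn when the Cassandra data volume passes a headroom threshold,
  # since compaction, cleanup, repair, or bootstrap can temporarily
  # need roughly as much free space as the data already occupies.
  DATA_DIR=/var/lib/cassandra/data   # assumed path; adjust as needed
  THRESHOLD=50                       # percent used, per the advice above

  USED=$(df -P "$DATA_DIR" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
  if [ "$USED" -gt "$THRESHOLD" ]; then
      echo "WARNING: $DATA_DIR is ${USED}% full (over ${THRESHOLD}%);" \
           "add disk or nodes before the next repair/compaction." >&2
  fi

Run from cron on each node; it prints nothing while usage stays under the threshold, so any output can be routed straight to an alert.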