From: Benjamin Coverston <ben.coverston@datastax.com>
Date: Tue, 07 Jun 2011 10:16:13 -0600
To: user@cassandra.apache.org
Subject: Re: Backups, Snapshots, SSTable Data Files, Compaction

Hi AJ,

Unfortunately, storage capacity planning is a bit of a guessing game. Until you run your load against the cluster and profile the usage, you just are not going to know for sure. I have seen cases where planning for 50% excess capacity per node was plenty, and I have seen other extreme cases where 3x the planned capacity was not enough when replica counts and entropy levels were high.

Cassandra will _try_ to work within the resource restrictions you give it, but keep in mind that if it has excess disk space it may be lazier than you would expect about getting rid of the extra files that are sitting around waiting to be deleted. You can tell which files are scheduled for deletion because they have a .compacted marker.

If you want to actually SEE this happen, use the stress.java or stress.py tools and do several test runs with different workloads. I think actually watching it happen would be enlightening for you.

Lastly, while I have seen a few instances where people have chosen node sizes in the tens of TB, it is an unusual case. Most node sizing I have seen falls in the range of 20-250 GB. That's not to say there aren't workloads where many TB per node can work, but if you're planning to read from the data you're writing, you do want to ensure that your working set is stored in memory.
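One more note on the files waiting to be deleted: if you want a filesystem-level view, the rough sketch below (my own illustration, not a DataStax tool) tallies the space held by SSTables that already carry a Compacted marker. The data directory path and the exact marker file name are assumptions here and vary between versions, so treat it as a starting point only.

#!/usr/bin/env python
# Rough estimate of disk space held by already-compacted SSTables.
# Sketch only: assumes a 0.7/0.8-style layout where an obsolete
# SSTable gets a companion "*-Compacted" marker file next to its
# -Data.db file. DATA_DIR and the file-name patterns are assumptions;
# adjust them for your version and install.
import os

DATA_DIR = "/var/lib/cassandra/data"   # assumed default data directory

reclaimable = 0
for root, dirs, files in os.walk(DATA_DIR):
    for name in files:
        # the marker itself is tiny; the space sits in the matching -Data.db
        if name.endswith("-Compacted"):
            base = name[:-len("-Compacted")]
            data_file = os.path.join(root, base + "-Data.db")
            if os.path.exists(data_file):
                size = os.path.getsize(data_file)
                reclaimable += size
                print("pending delete: %s (%.1f MB)" % (data_file, size / 1e6))

print("total reclaimable: %.2f GB" % (reclaimable / 1e9))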
HTH,

Ben

On 6/7/11 9:14 AM, AJ wrote:
> On 6/7/2011 2:29 AM, Maki Watanabe wrote:
>> You can find useful information in:
>> http://www.datastax.com/docs/0.8/operations/scheduled_tasks
>>
>> sstables are immutable. Once written to disk, they won't be updated.
>> When you take a snapshot, the tool makes hard links to the sstable
>> files. After some time, you will have had a number of memtable
>> flushes, so your sstable files will be merged, and obsolete sstable
>> files will be removed. But the snapshot set will remain on your disk,
>> for backup.
>>
>
> Thanks for the doc source. I will be experimenting with 0.8.0 since it
> has many features I've been waiting for.
>
> But still, if the snapshots don't link to all of the previous sets of
> .db files, then those unlinked previous file sets MUST be safe to
> manually delete. But they aren't deleted until later, after a GC. It's
> a bit confusing why they are kept after compaction up until GC when
> they seem not to be needed. We have Big Data plans... one node can
> have tens of TBs, so I'm trying to get an idea of how much disk space
> will be required and whether or not I can free up some disk space.
>
> Hopefully someone can still elaborate on this.
>

-- 
Ben Coverston
Director of Operations
DataStax -- The Apache Cassandra Company
http://www.datastax.com/
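As an aside on the snapshot question quoted above: one quick way to see Maki's hard-link point on a live node is to look at the link counts of the SSTable data files. The sketch below is only an illustration and assumes the default data directory layout; a link count above 1 usually means a snapshot still references the file, so deleting the live copy will not free the bytes until the snapshot copy is removed too.

#!/usr/bin/env python
# Illustration only: list live SSTable data files that are still
# hard-linked from a snapshot. Assumes the default layout where
# "nodetool snapshot" creates hard links under .../snapshots/.
import os

DATA_DIR = "/var/lib/cassandra/data"   # assumed default data directory

for root, dirs, files in os.walk(DATA_DIR):
    if "snapshots" in root.split(os.sep):
        continue                        # skip the snapshot copies themselves
    for name in files:
        if name.endswith("-Data.db"):
            path = os.path.join(root, name)
            links = os.stat(path).st_nlink
            if links > 1:
                # more than one link: a snapshot (or another link) shares this file
                print("%s: %d hard links" % (path, links))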