Date: Mon, 16 Aug 2010 19:26:43 -0700 (PDT)
From: Scott Dworkis <svd@mylife.com>
To: user@cassandra.apache.org
Subject: curious space usages after recovering a failed node

I followed the alternative approach for handling a failed node described here:

http://wiki.apache.org/cassandra/Operations

i.e. bringing up a replacement node with the same IP, bootstrapping it into the same token used by the failed node (via the InitialToken config parameter), then running a repair (rough steps sketched below). At the end of this process I had a data directory almost 3x the size of the directory on the failed node at the time of failure. I expected around 2x to cover the extra copies moving around, but 3x seems high for the headroom I should need for recovery.

The data I inserted was 100 copies of an almost-10M file, random partitioner, no overwriting or anything, replication factor of 2, so I'd expect the cluster to be using around 2G.

Here is what ring and du looked like after the initial data load:

Address       Status   Load         Range                                      Ring
                                    170141183460469231731687303715884105728
10.3.0.84     Up       448.8 MB     42535295865117307932921825928971026432    |<--|
10.3.0.85     Up       374 MB       85070591730234615865843651857942052864    |   |
10.3.0.114    Up       495 bytes    127605887595351923798765477786913079296   |   |
10.3.0.115    Up       496 bytes    170141183460469231731687303715884105728   |-->|

du of the data directory on each node (listed in ring order):

655M    /data/cassandra
655M    /data/cassandra
655M    /data/cassandra
1001M   /data/cassandra

So far so good...
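For reference, the recovery steps were roughly the ones below. This is a sketch from memory against 0.6-style storage-conf.xml, not a copy of my actual config; the token and address are the replaced node's (10.3.0.85 in the ring above):

<!-- storage-conf.xml on the replacement node: pin it to the failed node's token -->
<InitialToken>85070591730234615865843651857942052864</InitialToken>
<AutoBootstrap>true</AutoBootstrap>

# once the node has bootstrapped and rejoined the ring:
nodetool -h 10.3.0.85 repair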
Now, after the bootstrap:

Address       Status   Load         Range                                      Ring
                                    170141183460469231731687303715884105728
10.3.0.84     Up       467.5 MB     42535295865117307932921825928971026432    |<--|
10.3.0.85     Up       205.7 MB     85070591730234615865843651857942052864    |   |
10.3.0.114    Up       448.8 MB     127605887595351923798765477786913079296   |   |
10.3.0.115    Up       514.25 MB    170141183460469231731687303715884105728   |-->|

674M    /data/cassandra
206M    /data/cassandra
655M    /data/cassandra
767M    /data/cassandra

Also reasonable. Now, after the repair:

Address       Status   Load         Range                                      Ring
                                    170141183460469231731687303715884105728
10.3.0.84     Up       467.5 MB     42535295865117307932921825928971026432    |<--|
10.3.0.85     Up       916.3 MB     85070591730234615865843651857942052864    |   |
10.3.0.114    Up       654.5 MB     127605887595351923798765477786913079296   |   |
10.3.0.115    Up       514.25 MB    170141183460469231731687303715884105728   |-->|

674M    /data/cassandra
1.4G    /data/cassandra
655M    /data/cassandra
767M    /data/cassandra

So I'd need 3x headroom if I were to try this on a huge production data set? After 3 or 4 nodetool cleanups the ring looks OK, but the data directories have bloated:

Address       Status   Load         Range                                      Ring
                                    170141183460469231731687303715884105728
10.3.0.84     Up       467.5 MB     42535295865117307932921825928971026432    |<--|
10.3.0.85     Up       420.75 MB    85070591730234615865843651857942052864    |   |
10.3.0.114    Up       448.8 MB     127605887595351923798765477786913079296   |   |
10.3.0.115    Up       514.25 MB    170141183460469231731687303715884105728   |-->|

1.2G    /data/cassandra
842M    /data/cassandra
1.1G    /data/cassandra
1.3G    /data/cassandra

So the question is: should I plan on needing 3x headroom for node recoveries?

-scott
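P.S. One possibility I haven't ruled out: as I understand 0.6 behavior, sstables obsoleted by compaction/cleanup are flagged with a "Compacted" marker file, and the data files aren't actually unlinked until a JVM GC or a restart, so some of the du numbers above may include logically dead files. A quick way to check (a sketch, assuming my data path above and the usual marker-file naming):

# count sstables already flagged obsolete on a node
find /data/cassandra -name '*-Compacted' | wc -l

If that turns up files, a restart should reclaim the space, and the real post-recovery footprint would be lower than du suggests.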