From: aaron morton <aaron@thelastpickle.com>
To: user@cassandra.apache.org
Subject: Re: nodetool repair uses insane amount of disk space
Date: Fri, 17 Aug 2012 10:57:57 +1200

What version are you using? There were issues with repair using lots-o-space in 0.8.X; it's fixed in 1.X.
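If you're not sure what each node is actually running, newer builds have a nodetool version command, and every node also logs its version at startup, so one of these should tell you (host and log path are just the usual defaults, adjust for your install):

    nodetool -h <host> version
    grep "Cassandra version" /var/log/cassandra/system.log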
Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 2:56 AM, Michael Morris <michael.m.morris@gmail.com> wrote:

> Occasionally, as I'm doing my regular anti-entropy repair, I end up with a node that uses an exceptional amount of disk space (the node should have about 5-6 GB of data on it, but ends up with 25+ GB and consumes the limited amount of disk space I have available).
>
> How come a node would consume 5x its normal data size during the repair process?
>
> My setup is kind of strange in that it's only about 80-100 GB of data on a 35-node cluster, with 2 data centers and 3 racks, but the rack assignments are unbalanced. One data center has 8 nodes, and the other data center is split into 2 racks, one with 9 nodes and the other with 18. However, within each rack the tokens are distributed equally. It's a long sad story about how we ended up this way, but it basically boils down to having to use existing resources to resolve a production issue.
>
> Additionally, the repair process takes (what I feel is) an extremely long time to complete (36+ hours), and it always seems that nodes are streaming data to each other, even on back-to-back executions of the repair.
>
> Any help on these issues is appreciated.
>
> - Mike
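Re the 36+ hour runs and the constant streaming: if you are on 1.0 or later, repairing only each node's primary range and rotating through the cluster (rather than running a full repair from every node) usually cuts down on the re-streaming; roughly:

    nodetool -h <host> repair -pr <keyspace>

nodetool netstats on a node will show what it is actually streaming while the repair runs.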