Subject: nodetool repair uses insane amount of disk space
From: Michael Morris <michael.m.morris@gmail.com>
To: user@cassandra.apache.org
Date: Thu, 16 Aug 2012 09:56:15 -0500

Occasionally as I'm doing my regular anti-entropy repair I end up with a node that uses an exceptional amount of disk space (the node should have about 5-6 GB of data on it, but ends up with 25+ GB, and consumes the limited amount of disk space I have available).

How come a node would consume 5x its normal data size during the repair process?

My setup is kind of strange in that it's only about 80-100 GB of data on a 35-node cluster, with 2 data centers and 3 racks; however, the rack assignments are unbalanced. One data center has 8 nodes, and the other data center is split into 2 racks, one with 9 nodes and the other with 18 nodes. However, within each rack, the tokens are distributed equally. It's a long, sad story about how we ended up this way, but it basically boils down to having to utilize existing resources to resolve a production issue.
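In case it's useful context, these are roughly the commands involved (the host and keyspace names below are placeholders, not our actual ones):

    # check token ownership and the load each node reports
    nodetool -h <node> ring

    # check whether compactions are still backed up after a repair finishes
    nodetool -h <node> compactionstats

    # the repair itself, run against each node in turn
    nodetool -h <node> repair <keyspace>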
Additionally, the repair process takes (what I feel is) an extremely long time to complete (36+ hours), and it always seems that nodes are streaming data to each other, even on back-to-back executions of the repair.

Any help on these issues is appreciated.

- Mike