From: Erik Forkalsud
Date: Thu, 28 Feb 2013 16:55:55 -0800
To: user@cassandra.apache.org
Message-ID: <512FFC9B.9050200@cj.com>
Subject: Re: best way to clean up a column family?
60Gig of dangling data

Have you tried calling org.apache.cassandra.db.CompactionManager.forceUserDefinedCompaction() via JMX, giving it the name of your SSTable file? It's a trick I use to aggressively get rid of expired data: if I have a column family where all data is written with a TTL of 30 days, any SSTable file with a last-modified time more than 30 days ago will contain only expired data, so I call the above function to compact those files one by one.

In your case it sounds like it's not expired data you want to get rid of, but data that belongs on other nodes. I'm not sure whether compaction will drop data that doesn't fall within the node's key range, but if it does, this method should have the effect you're after.

- Erik -

On 02/27/2013 08:51 PM, Hiller, Dean wrote:
> Okay, we had 6 nodes of 130Gig and it was slowly increasing. Through our operations to modify bloomfilter fp chance, we screwed something up, as trying to relieve memory pressure was tough. Anyway, somehow this caused nodes 1, 2, and 3 to jump to around 200Gig, and our incoming data stream is completely constant at around 260 points/second.
>
> Sooo, we know this dangling data (around 60Gigs) is in one single column family. Nodes 1, 2, and 3 are for the first token range according to ringdescribe. It is almost like the issue is now replicated to the other two nodes. Is there any way we can go about debugging this and release the 60 gigs of disk space?
>
> Also, upgradesstables is not working too well when memory is already close to max. Can we do this instead (i.e., is it safe)?
>
> 1. Bring down the node
> 2. Move all the *Index.db files to another directory
> 3. Start the node and run upgradesstables
>
> We know this relieves a ton of memory out of the gate for us.
> We are trying to get memory back down by a gig, then upgrade to 1.2.2 and switch to leveled compaction, as we have ZERO I/O really going on most of the time and really just have this bad bad memory bottleneck (iostat shows nothing typically, as we are bottlenecked by memory).
>
> Thanks,
> Dean
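
For the TTL trick in the reply above, the first step is picking out the SSTable files whose last-modified time is older than the TTL. A minimal sketch of that selection step follows. The function name expired_sstables and the "-Data.db" suffix filter are assumptions made for illustration, and the heuristic only holds if every write to the column family uses the same TTL, as the reply states. The JMX call itself is not shown.

```python
import os
import time


def expired_sstables(data_dir, ttl_days, now=None):
    """Return names of *-Data.db files in data_dir last modified more than ttl_days ago.

    Hypothetical helper: it only selects candidate files by mtime and does not
    talk to Cassandra or trigger any compaction.
    """
    now = time.time() if now is None else now
    cutoff = now - ttl_days * 86400  # seconds per day

    candidates = []
    for name in sorted(os.listdir(data_dir)):
        # Assumption: only the data component of each SSTable is of interest.
        if not name.endswith("-Data.db"):
            continue
        path = os.path.join(data_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            candidates.append(name)
    return candidates
```

Calling expired_sstables on the column family's data directory with ttl_days=30 would return the file names that could then be passed, one at a time, to forceUserDefinedCompaction() through a JMX client.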