From: "Hiller, Dean" <Dean.Hiller@nrel.gov>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thu, 28 Feb 2013 19:40:27 -0700
Subject: Re: best way to clean up a column family? 60Gig of dangling data
Message-ID: <CD556324.21B13%Dean.Hiller@nrel.gov>
In-Reply-To: <512FFC9B.9050200@cj.com>

Cool, thanks for the trick.
Dean

On 2/28/13 5:55 PM, "Erik Forkalsud" <eforkalsrud@cj.com> wrote:

>
>Have you tried calling
>org.apache.cassandra.db.CompactionManager.forceUserDefinedCompaction()
>via JMX and giving it the name of your SSTable file?
>
>It's a trick I use to aggressively get rid of expired data, i.e. if I
>have a column family where all data is written with a TTL of 30 days,
>any SSTable files with a last-modified time of more than 30 days ago will
>have only expired data, so I call the above function to compact those
>files one by one.  In your case it sounds like it's not expired data,
>but data that belongs on other nodes that you want to get rid of.  I'm
>not sure if compaction will drop data that doesn't fall within the node's
>key range, but if it does, this method should have the effect you're after.
>
>
>- Erik -
>
>
>On 02/27/2013 08:51 PM, Hiller, Dean wrote:
>> Okay, we had 6 nodes at 130Gig each, slowly increasing.  While
>>modifying the bloom filter fp chance to relieve memory pressure, we
>>screwed something up.  Somehow, this caused nodes 1, 2, and 3 to jump to
>>around 200Gig, even though our incoming data stream is completely
>>constant at around 260 points/second.
>>
>> Sooo, we know this dangling data (around 60Gig) is in one single column
>>family.  Nodes 1, 2, and 3 are for the first token range according to
>>ringdescribe.  It is almost like the issue is now replicated to the
>>other two nodes.  Is there any way we can go about debugging this and
>>releasing the 60 gigs of disk space?
>>
>> Also, running upgradesstables when memory is already close to max is not
>>working too well.  Can we do this instead (i.e., is it safe)?
>>
>>   1.  Bring down the node
>>   2.  Move all the *Index.db files to another directory
>>   3.  Start the node and run upgradesstables
>>
>> We know this relieves a ton of memory out of the gate for us.  We are
>>trying to get memory back down by a gig, then upgrade to 1.2.2 and
>>switch to leveled compaction, as we have ZERO I/O really going on most of
>>the time and really just have this bad, bad memory bottleneck (iostat
>>typically shows nothing, as we are bottlenecked by memory).
>>
>> Thanks,
>> Dean
>
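
For reference, the user-defined compaction trick Erik describes above could be sketched roughly as follows. This is a minimal sketch, not a tested tool. Several things in it are assumptions rather than anything stated in the thread: the class and method names are made up for illustration, the MBean is assumed to be registered as org.apache.cassandra.db:type=CompactionManager on the default JMX port 7199, and the operation is assumed to take (keyspace, dataFiles) as two strings, which is believed to match the 1.x line (later versions are believed to take a single string, so check the MBean in jconsole first). It also assumes SSTable data files end in -Data.db and that a bare file name is accepted.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper; names are illustrative, not part of Cassandra.
final class ExpiredSSTableCompactor {

    // Returns the names of *-Data.db files last modified before (now - ttl).
    // Per the trick above, if every write used the same TTL, such files
    // should hold only expired data.
    static List<String> expiredDataFiles(Path cfDir, Duration ttl, Instant now)
            throws IOException {
        Instant cutoff = now.minus(ttl);
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(cfDir, "*-Data.db")) {
            for (Path p : stream) {
                if (Files.getLastModifiedTime(p).toInstant().isBefore(cutoff)) {
                    names.add(p.getFileName().toString());
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    // Invokes forceUserDefinedCompaction once per file, "one by one".
    // ASSUMPTION: two-String signature (keyspace, dataFiles); adjust the
    // params and signature arrays if your version exposes a different one.
    static void compactOneByOne(String host, int port, String keyspace,
                                List<String> dataFiles) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName cm = new ObjectName("org.apache.cassandra.db:type=CompactionManager");
            for (String file : dataFiles) {
                mbs.invoke(cm, "forceUserDefinedCompaction",
                        new Object[] {keyspace, file},
                        new String[] {"java.lang.String", "java.lang.String"});
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: <cf-data-dir> <ttl-days> [<jmx-host> <keyspace>]");
            return;
        }
        List<String> files = expiredDataFiles(Paths.get(args[0]),
                Duration.ofDays(Long.parseLong(args[1])), Instant.now());
        files.forEach(System.out::println);
        // Only touch the node when a JMX host and keyspace are given explicitly.
        if (args.length >= 4) {
            compactOneByOne(args[2], 7199, args[3], files);
        }
    }
}
```

Run with only the directory and TTL arguments, it just prints the candidate files, which makes it easy to sanity-check the list before letting it invoke anything on a live node. As Erik notes, this targets expired data; whether it helps with data outside the node's key range is a separate question.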