incubator-cassandra-user mailing list archives

From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: best way to clean up a column family? 60Gig of dangling data
Date Fri, 01 Mar 2013 02:40:27 GMT
Cool, thanks for the trick.
Dean

On 2/28/13 5:55 PM, "Erik Forkalsud" <eforkalsrud@cj.com> wrote:

>
>Have you tried calling (via JMX)
>org.apache.cassandra.db.CompactionManager.forceUserDefinedCompaction()
>and giving it the name of your SSTable file?
>
>It's a trick I use to aggressively get rid of expired data, i.e. if I
>have a column family where all data is written with a TTL of 30 days,
>any SSTable files with last modified time of more than 30 days ago will
>have only expired data, so I call the above function to compact those
>files one by one.  In your case it sounds like it's not expired data,
>but data that belongs on other nodes that you want to get rid of.  I'm
>not sure if compaction will drop data that doesn't fall within the node's
>key range, but if it does, this method should have the effect you're after.
>
>
>- Erik -
>
>
>On 02/27/2013 08:51 PM, Hiller, Dean wrote:
>> Okay, we had 6 nodes of 130Gig and it was slowly increasing.  Through
>>our operations to modify bloomfilter fp chance, we screwed something up
>>as trying to relieve memory pressures was tough.  Anyways, somehow, this
>>caused nodes 1, 2, and 3 to jump to around 200Gig and our incoming data
>>stream is completely constant at around 260 points/second.
>>
>> Sooo, we know this dangling data (around 60Gigs) is in one single column
>>family.  Nodes 1, 2, and 3 are for the first token range according to
>>ringdescribe.  It is almost like the issue is now replicated to the
>>other two nodes.  Is there any way we can go about debugging this and
>>release the 60 gigs of disk space?
>>
>> Also, running upgradesstables when memory is already close to max is not
>>working too well.  Can we do this instead (i.e., is it safe)?
>>
>>   1.  Bring down the node
>>   2.  Move all the *Index.db files to another directory
>>   3.  Start the node and run upgradesstables
>>
>> We know this relieves a ton of memory out of the gate for us.  We are
>>trying to get memory back down by a gig, then upgrade to 1.2.2 and
>>switch to leveled compaction as we have ZERO I/O really going on most of
>>the time and really just have this bad bad memory bottleneck (iostat
>>shows nothing typically as we are bottlenecked by memory).
>>
>> Thanks,
>> Dean
>
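
A rough sketch of the file-selection half of Erik's trick: list the SSTable data files whose last-modified time is older than the TTL, oldest first, so they can be handed to forceUserDefinedCompaction() one by one. The function name `expired_sstables`, the `*-Data.db` filename pattern, and the directory layout are assumptions for illustration, not something stated in the thread; the JMX call itself is not shown. It only makes sense for a column family where every write carries the same TTL.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def expired_sstables(data_dir, ttl_days, now=None):
    """Return SSTable data files under data_dir last modified more than
    ttl_days ago, oldest first.

    Assumption: data files end in "-Data.db". Adjust the pattern if your
    Cassandra version names them differently.
    """
    now = time.time() if now is None else now
    cutoff = now - ttl_days * SECONDS_PER_DAY

    candidates = []
    for path in Path(data_dir).rglob("*-Data.db"):
        if path.is_file():
            mtime = path.stat().st_mtime
            if mtime < cutoff:
                candidates.append((mtime, path))

    # Oldest files first, so the most certainly expired data goes first.
    candidates.sort(key=lambda pair: pair[0])
    return [path for _, path in candidates]


# Example usage (hypothetical path; pass each name to the JMX operation):
#   for sstable in expired_sstables("/var/lib/cassandra/data/myks", 30):
#       print(sstable.name)
```

The script only reads file metadata, so it can be run against a live node's data directory to preview which files would qualify before compacting anything.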

