incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: manually removing sstable
Date Fri, 12 Jul 2013 08:25:28 GMT
That sounds sane to me. Couple of caveats:

* Remember that Expiring Columns turn into Tombstones and can only be purged after both the TTL and gc_grace have passed.
* Tombstones will only be purged if all fragments of a row are in the SSTable(s) being compacted.
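
To make the first caveat concrete, here is a minimal shell sketch of the timing arithmetic. It assumes purge eligibility is simply write time + TTL + gc_grace, all in epoch seconds. The function names are hypothetical helpers, not part of Cassandra, and the sketch says nothing about the second caveat (row fragments living in other SSTables).

```shell
# Hypothetical helpers for illustration; not part of Cassandra.

# earliest_purge_epoch WRITE_EPOCH TTL_SECONDS GC_GRACE_SECONDS
# Prints the earliest epoch second at which an expiring column could be
# purged, ASSUMING eligibility is write time + TTL + gc_grace.
earliest_purge_epoch() {
    echo $(( $1 + $2 + $3 ))
}

# is_purgeable NOW_EPOCH WRITE_EPOCH TTL_SECONDS GC_GRACE_SECONDS
# Succeeds (exit status 0) only when NOW_EPOCH is past that point.
is_purgeable() {
    [ "$1" -gt "$(earliest_purge_epoch "$2" "$3" "$4")" ]
}
```

For example, `is_purgeable "$(date +%s)" 1373500000 86400 864000 && echo purgeable` checks a column written at epoch 1373500000 with a one-day TTL and a ten-day gc_grace.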


Cheers
  
-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 11/07/2013, at 10:17 PM, Theo Hultberg <theo@iconara.net> wrote:

> a colleague of mine came up with an alternative solution that also seems to work, and I'd just like your opinion on whether it's sound.
> 
> we run find to list all old sstables, and then use cmdline-jmxclient to run the forceUserDefinedCompaction function on each of them. this is roughly what we do (but with find and xargs to orchestrate it):
> 
>   java -jar cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction=the_keyspace,db_file_name
> 
> the downside is that c* needs to read the file and do disk io, but the upside is that it doesn't require a restart. c* does a little more work, but we can schedule that during off-peak hours. another upside is that it feels like we're pretty safe from screwups: we won't accidentally remove an sstable with live data. the worst case is that we ask c* to compact an sstable with live data and end up with an identical sstable.
> 
> if anyone else wants to do the same thing, this is the full cron command:
> 
> 0 4 * * * find /path/to/cassandra/data/the_keyspace_name -maxdepth 1 -type f -name '*-Data.db' -mtime +8 -printf "forceUserDefinedCompaction=the_keyspace_name,\%P\n" | xargs -t --no-run-if-empty java -jar /usr/local/share/java/cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager
> 
> just change the keyspace name and the path to the data directory.
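
Before scheduling a job like the one quoted above, it may help to see which arguments it would generate. The following is a dry-run sketch only: `list_compaction_args` is a hypothetical helper name, it prints the `forceUserDefinedCompaction=...` arguments without contacting Cassandra, and it assumes GNU find (for `-printf`) and a keyspace name containing no `%` or backslash characters.

```shell
# Hypothetical dry-run helper; it never talks to Cassandra.
# list_compaction_args DATA_DIR KEYSPACE MIN_AGE_DAYS
# Prints one JMX argument per *-Data.db file directly under DATA_DIR whose
# age in whole 24-hour periods is greater than MIN_AGE_DAYS.
# Requires GNU find for -printf. In a script, %P needs no backslash; the
# \% in the crontab entry is there only because cron treats a bare % specially.
list_compaction_args() {
    find "$1" -maxdepth 1 -type f -name '*-Data.db' -mtime "+$3" \
        -printf "forceUserDefinedCompaction=$2,%P\n"
}
```

Running `list_compaction_args /path/to/cassandra/data/the_keyspace_name the_keyspace_name 8` should list the same arguments that the cron entry pipes into xargs.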
> 
> T#
> 
> 
> On Thu, Jul 11, 2013 at 7:09 AM, Theo Hultberg <theo@iconara.net> wrote:
> thanks a lot. I can confirm that it solved our problem too.
> 
> looks like the C* 2.0 feature is perfect for us.
> 
> T#
> 
> 
> On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson <krummas@gmail.com> wrote:
> yep that works, you need to remove all components of the sstable though, not just -Data.db
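
As a rough illustration of "all components", the sketch below lists every file that shares a prefix with a given -Data.db file. It assumes the component files differ only in the suffix after the last shared `-` (for example a hypothetical `ks-cf-ic-5-Data.db` next to `ks-cf-ic-5-Index.db`). `list_sstable_components` is a made-up helper name, and it only prints paths; it deletes nothing.

```shell
# Hypothetical helper; lists files only and never removes anything.
# list_sstable_components /path/to/<prefix>-Data.db
# Prints every file that shares the sstable prefix of the given Data.db file,
# ASSUMING all components are named <prefix>-<Component>.
list_sstable_components() {
    prefix=${1%-Data.db}            # strip the trailing "-Data.db"
    for f in "$prefix"-*; do
        # Skip the literal pattern when the glob matches nothing.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

The trailing `-` in the glob keeps generation 5 from also matching generation 51. Reviewing the printed list by hand, with Cassandra stopped as described in the original question, seems safer than wiring it straight into `rm`.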
> 
> and, in 2.0 there is this:
> https://issues.apache.org/jira/browse/CASSANDRA-5228
> 
> /Marcus
> 
> 
> On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg <theo@iconara.net> wrote:
> Hi,
> 
> I think I remember reading that if you have sstables that you know contain only data whose ttl has expired, it's safe to remove them manually by stopping c*, removing the *-Data.db files and then starting up c* again. is this correct?
> 
> we have a cluster where everything is written with a ttl, and sometimes c* needs to compact over 100 gb of sstables where we know everything has expired, and we'd rather just manually get rid of those.
> 
> T#
> 
> 
> 

