incubator-cassandra-user mailing list archives

From "Jeremiah Jordan" <JEREMIAH.JOR...@morningstar.com>
Subject RE: Backups, Snapshots, SSTable Data Files, Compaction
Date Tue, 07 Jun 2011 15:27:14 GMT
Don't manually delete things.  Let Cassandra do it.  Force a garbage
collection or restart your instance and Cassandra will delete the unused
files.

-----Original Message-----
From: AJ [mailto:aj@dude.podzone.net] 
Sent: Tuesday, June 07, 2011 10:15 AM
To: user@cassandra.apache.org
Subject: Re: Backups, Snapshots, SSTable Data Files, Compaction

On 6/7/2011 2:29 AM, Maki Watanabe wrote:
> You can find useful information in:
> http://www.datastax.com/docs/0.8/operations/scheduled_tasks
>
> sstables are immutable. Once written to disk, they are never updated.
> When you take a snapshot, the tool makes hard links to the sstable files.
> After a while, several memtable flushes will have occurred, so
> your sstable files will be merged, and the obsolete sstable files will be
> removed. But the snapshot set will remain on your disk, for backup.
>
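The hard-link behaviour described above can be illustrated with a small sketch. This is not Cassandra code; it only assumes a POSIX-style filesystem, and the directory layout and the file name example-1-Data.db are hypothetical. It creates a file, hard-links it the way a snapshot would, deletes the original, and checks that the data is still reachable through the link.

```python
import os
import tempfile


def snapshot_survives_delete():
    """Hard-link a file, delete the original, and inspect the surviving link.

    Returns (link count before delete, link count after delete, bytes readable).
    """
    with tempfile.TemporaryDirectory() as root:
        data_dir = os.path.join(root, "data")
        snap_dir = os.path.join(root, "snapshots")
        os.makedirs(data_dir)
        os.makedirs(snap_dir)

        # Hypothetical sstable-like file; the name is for illustration only.
        live = os.path.join(data_dir, "example-1-Data.db")
        with open(live, "wb") as f:
            f.write(b"x" * 1024)

        # A snapshot adds a second directory entry for the same inode.
        snap = os.path.join(snap_dir, "example-1-Data.db")
        os.link(live, snap)
        links_before = os.stat(snap).st_nlink

        # Removing the "live" name only drops one link; the data stays on disk.
        os.remove(live)
        links_after = os.stat(snap).st_nlink

        with open(snap, "rb") as f:
            readable = len(f.read())

        return links_before, links_after, readable


if __name__ == "__main__":
    print(snapshot_survives_delete())
```

The disk blocks are released only when the last link to the file is removed, which is why a snapshot keeps its files after the live copies are gone.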

Thanks for the doc source.  I will be experimenting with 0.8.0 since it
has many features I've been waiting for.

But still, if the snapshots don't link to all of the previous sets of
.db files, then those unlinked previous file sets MUST be safe to
manually delete.  Yet they aren't deleted until later, after a GC.  It's
a bit confusing why they are kept after compaction until GC when they
no longer seem to be needed.  We have Big Data plans... one node can have
tens of TBs, so I'm trying to get an idea of how much disk space will be
required and whether or not I can free up some disk space.
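
For estimating disk space when snapshots are present, one rough approach is to count each hard-linked file only once. The sketch below is an assumption-laden illustration, not a Cassandra tool: the function names unique_disk_usage and demo and the file name example-1-Data.db are hypothetical, and it sums apparent file sizes rather than allocated blocks.

```python
import os
import tempfile


def unique_disk_usage(root):
    """Sum file sizes under root, counting each (device, inode) pair once.

    Files that are hard links to the same inode (for example a live file
    and its snapshot link) contribute their size only one time.
    Uses st_size (apparent size), not the number of allocated blocks.
    """
    seen = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                total += st.st_size
    return total


def demo():
    """Build a tiny directory with one file plus a hard link and measure it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "snapshots"))
        live = os.path.join(root, "example-1-Data.db")  # hypothetical name
        with open(live, "wb") as f:
            f.write(b"x" * 4096)
        os.link(live, os.path.join(root, "snapshots", "example-1-Data.db"))
        return unique_disk_usage(root)


if __name__ == "__main__":
    print(demo())
```

Pointing unique_disk_usage at a data directory would give the space actually held by distinct files, while a naive per-file sum would count snapshot links twice.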

Hopefully someone can still elaborate on this.


