incubator-cassandra-user mailing list archives

From Watanabe Maki <watanabe.m...@gmail.com>
Subject Re: Snapshotting to a different volume?
Date Thu, 19 May 2011 01:33:35 GMT
Please note that every file name on a Unix file system is basically a hard link referring to a
specific inode. If you make a hard link to a file, that inode then has two names referring to it.
When an SSTable is compacted and GCed, Cassandra "deletes" the old SSTable but keeps the snapshot.
The reference count on the inode then drops to one, so the data stays on disk under the snapshot's name.
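
To make the link counting concrete, here is a minimal Python sketch. It is not Cassandra's code; the directory layout and the file name Example-1-Data.db are made up for illustration. It creates a file, hard-links it the way a snapshot would, and then removes the original name.

```python
import os
import tempfile


def demo_hard_link():
    """Show how an inode's link count changes when a 'snapshot' hard link
    is created and the original name is then removed."""
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = os.path.join(tmp, "data")
        snap_dir = os.path.join(tmp, "snapshots")
        os.makedirs(data_dir)
        os.makedirs(snap_dir)

        # Hypothetical SSTable file name, for illustration only.
        sstable = os.path.join(data_dir, "Example-1-Data.db")
        with open(sstable, "wb") as f:
            f.write(b"immutable sstable bytes")

        # "Snapshot": a second name for the same inode; no data is copied.
        snapshot = os.path.join(snap_dir, "Example-1-Data.db")
        os.link(sstable, snapshot)

        same_inode = os.stat(sstable).st_ino == os.stat(snapshot).st_ino
        links_after_snapshot = os.stat(snapshot).st_nlink

        # "Compaction cleanup": remove the original name only.
        os.remove(sstable)
        links_after_delete = os.stat(snapshot).st_nlink

        with open(snapshot, "rb") as f:
            surviving_data = f.read()

    return same_inode, links_after_snapshot, links_after_delete, surviving_data
```

Both names share one inode, the link count goes from two back to one after the removal, and the bytes are still readable through the snapshot name.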
The big advantage of a hard link is that you don't need to copy the data, so a snapshot completes
very fast. If you need a separate copy of the snapshot on a different volume, you can write a script
to copy it.
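
Such a script could look roughly like the following Python sketch. The function name copy_snapshot and the paths in the usage comment are assumptions for illustration, not anything Cassandra ships; adjust them to your own layout.

```python
import os
import shutil


def copy_snapshot(snapshot_dir, backup_root):
    """Copy a snapshot directory to another location (e.g. a backup volume).

    shutil.copytree copies the file contents, so the result does not depend
    on the inodes of the source volume. It raises FileExistsError if the
    destination directory already exists.
    """
    name = os.path.basename(os.path.normpath(snapshot_dir))
    dest = os.path.join(backup_root, name)
    shutil.copytree(snapshot_dir, dest)
    return dest


# Example call with hypothetical paths:
# copy_snapshot("/var/lib/cassandra/data/MyKeyspace/snapshots/1305768815",
#               "/mnt/backup/MyKeyspace")
```

Unlike the snapshot itself, this really copies the data, so it takes time and disk space proportional to the size of the SSTables.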

From iPhone


On 2011/05/19, at 10:17, Sameer Farooqui <cassandralabs@gmail.com> wrote:

> Ahh.. yeah. And during a compaction a new SSTable is created with the merged data.
> 
> So, if I take a snapshot before compaction, the old SSTables won't be deleted (b/c the
> snapshot hard links still have a reference to the files).
> 
> But if I hadn't taken a snapshot before compaction, does compaction also automatically
> delete the old SSTables?
> 
> FYI - The O'Reilly Cassandra book has a really misleading definition of snapshotting.
> It says that a snapshot makes a copy of the keyspace and saves it to a separate database file.
> 
> Is there a way to do a read from a snapshot? So, using a client like Hector, can I request
> a read from a snapshot that I took like 2 weeks ago?
> 
> It sounds like the main benefit of a snapshot is that it makes a bunch of nicely organized
> hard links in a separate folder from a specific point in time.
> 
> So, taking snapshots to a different volume doesn't make sense since the hard links can't
> span file systems. But it would be nice to have a feature where the entire point-in-time copy
> of the SSTables can be copied to a different volume. Currently if the data volume gets corrupted,
> the snapshots on it can also get corrupted.
> 
> On Wed, May 18, 2011 at 5:44 PM, Watanabe Maki <watanabe.maki@gmail.com> wrote:
> SSTables are immutable. They won't change once written to disk.
> 
> From iPhone
> 
> 
> On 2011/05/19, at 9:37, Sameer Farooqui <cassandralabs@gmail.com> wrote:
> 
>> As of 0.8.0, is it possible to take a Cassandra snapshot to a different volume (like
>> an EBS volume dedicated for backups)?
>> 
>> About a year ago, Jonathan Ellis said that this won't be implemented b/c snapshots
>> are basically hard links:
>> http://mail-archives.apache.org/mod_mbox/cassandra-commits/201002.mbox/%3C821546961.221891265939548009.JavaMail.jira@brutus.apache.org%3E
>> 
>> But I don't fully understand that. If a snapshot is just a hard link, won't the snapshot
>> also change as new data is written to the SSTables?
>> 
>> - Sameer
> 
