cassandra-user mailing list archives

From Peter Schuller
Subject Re: Backup/Restore: Coordinating Cassandra Nodetool Snapshots with Amazon EBS Snapshots?
Date Thu, 23 Jun 2011 16:32:28 GMT
> A snippet from the wikipedia page on XFS for example:
> ...
> Snapshots
> XFS does not provide direct support for snapshots, as it expects the
> snapshot process to be implemented by the volume manager. Taking a snapshot
> of an XFS filesystem involves freezing I/O to the filesystem using
> the xfs_freeze utility, having the volume manager perform the actual
> snapshot, and then unfreezing I/O to resume normal operations. The snapshot
> can then be mounted read-only for backup purposes. XFS releases on IRIX
> incorporated an integrated volume manager called XLV. This volume manager
> has not been ported to Linux and XFS works with standard LVM instead. In
> recent Linux kernels, the xfs_freeze functionality is implemented in the VFS
> layer, and happens automatically when the Volume Manager's snapshot
> functionality is invoked. This was once a valuable advantage as Ext3 system
> could not be suspended[4] and volume manager was unable to create a
> consistent 'hot' snapshot to backup a heavily busy database.[5] Fortunately
> this is no longer the case. Since Linux 2.6.29 ext3, ext4, gfs2 and jfs have
> the freeze feature as well.[6]
> ...

The above is misleading, at least when read out of context (I didn't
check the article). The only hint that freezing is necessary only
with non-atomic snapshots is in the "... and volume manager was
unable to create a consistent 'hot' snapshot" part.

> I haven't touched the linux kernel for many years now, so honestly I'm
> talking about what I've read in the last few years (rather than relying on
> the actual kernel/drivers code). But if I have to trust this and many other
> articles like it, I'm interpreting that freezing the FS (directly or
> indirectly by LVM) is, indeed, necessary. Not just for XFS but for other
> log-based filesystems. Honestly speaking, I'm not sure of the exact
> technical reason why...maybe it is to stop reads to the actual device, or to
> ensure some sort of log flushing depending on your settings, ... etc.
> dunno...perhaps somebody else knows and wants to share it.

It's wrong, no matter how many places people claim it. The problem is
that with storage, there are *lots* of urban legends and people making
strange claims. In this case it is wrong for fundamental reasons
independent of kernel implementation details.

Freezing may very well have been empirically needed in some
particular cases; possible reasons include write barriers not
propagating, using the fs on a multi-device array where global
snapshots are not supported, etc. But fundamentally, an atomic
snapshot will yield a consistent (or recoverable) file system if the
file system is crash-consistent AND is used/configured/accessed in
such a way that it actually is crash-consistent for real (disabling
synchronous writes, or write barriers not propagating through lvm,
are examples of setups where it is not).
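
To make the argument concrete, here is a toy model in Python. It is a
sketch under heavy assumptions, not how any real file system lays out
its journal: the "device" is four hypothetical blocks (journal, commit,
a, b), every write reaches the device in program order (i.e. barriers
are honored), and the only invariant is a == b. An atomic snapshot is
then the device image after some prefix of the write stream, which is
exactly what a power failure at that moment would leave behind.

```python
from itertools import product

# Hypothetical block names for this toy model only.
KEYS = ("journal", "commit", "a", "b")


def write_stream(num_txns):
    """Ordered device writes for transactions 1..num_txns (invariant: a == b)."""
    writes = []
    for n in range(1, num_txns + 1):
        writes.append(("journal", n))  # 1. log the intended new value
        writes.append(("commit", n))   # 2. mark that log record as committed
        writes.append(("a", n))        # 3. in-place update of the first block
        writes.append(("b", n))        # 4. in-place update of the second block
    return writes


def state_at(writes, cut):
    """Device contents after the first `cut` writes (a point-in-time image)."""
    state = dict.fromkeys(KEYS, 0)
    for key, value in writes[:cut]:
        state[key] = value
    return state


def recover(image):
    """Crash recovery: replay the journal record if it was committed."""
    image = dict(image)
    if image["journal"] == image["commit"]:
        image["a"] = image["b"] = image["journal"]
    return image


def is_consistent(image):
    return image["a"] == image["b"]


def atomic_snapshots(writes):
    """Every possible atomic snapshot: one image per prefix of the writes."""
    return [state_at(writes, cut) for cut in range(len(writes) + 1)]


def non_atomic_snapshots(writes):
    """Every snapshot where each block is copied at its own moment in time."""
    states = atomic_snapshots(writes)
    for cuts in product(range(len(states)), repeat=len(KEYS)):
        yield {key: states[cut][key] for key, cut in zip(KEYS, cuts)}


if __name__ == "__main__":
    writes = write_stream(3)
    print(all(is_consistent(recover(s)) for s in atomic_snapshots(writes)))
    print(any(not is_consistent(recover(s)) for s in non_atomic_snapshots(writes)))
```

In this model every atomic snapshot recovers to a consistent state
without any freezing, while a copy that captures the blocks at
different times can end up with an uncommitted journal record next to
a and b taken from two different transactions, which recovery cannot
repair. That second case is the one where quiescing writes first
actually matters.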

Of course I realize I am just another screaming voice. I guess this is
another blog entry to write and explain this in detail. I should
really start working off the backlog of those blog entries...

/ Peter Schuller
