incubator-cassandra-user mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject Re: Backup/Restore: Coordinating Cassandra Nodetool Snapshots with Amazon EBS Snapshots?
Date Thu, 23 Jun 2011 16:00:09 GMT
> If taking an atomic snapshot of the device on which a file system is
> located on, assuming the file system is designed to be crash
> consistent, it *has* to result in a consistent snapshot. Anything else
> would directly violate the claim that the file system is crash
> consistent, making the premise false.

Let me clarify. Crash-consistent file systems work like that by
relying on write barriers. This is what is exposed by fsync() to
userland (fsync() actually provides full durability guarantees, not
just write barriers, but for the purpose of consistency, it is the
write barrier you are interested in).

A write barrier is such that given a sequence of events like:

(1) write X
(2) insert write barrier
(3) write Y

It is guaranteed that if Y is written (i.e., readable in the future)
then X is also written.
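
As a rough userland sketch of the sequence above (not anything
Cassandra-specific), fsync() can stand in for the barrier. The
function name and the single append-only file are my own assumptions
for illustration; a real application would also have to think about
syncing the containing directory when the file is newly created.

```python
import os


def write_x_then_y(path, x, y):
    """Append x, force it to stable storage, then append y.

    Hypothetical helper: fsync() gives full durability, which is
    stronger than a pure write barrier but implies the same ordering.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, x)   # (1) write X
        os.fsync(fd)      # (2) barrier: returns once X is on stable storage
        os.write(fd, y)   # (3) write Y; if Y survives a crash, so does X
    finally:
        os.close(fd)
```

If a crash hits after step (2), X is on disk regardless of what
happened to Y; if it hits before, neither X nor Y is guaranteed.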

It is upon this underlying guarantee that file systems like xfs,
ext4 and zfs do their job. Their consistency semantics rely on this
behavior, and it is what allows the file system to be
crash-consistent. In other words, at any given moment in a timeline
of writes you can pause/crash/restart, causing a sudden interruption
of the I/O. This has to lead to either a directly consistent state or
a deterministically recoverable one in order for the file system to
be called crash-consistent.
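
To make the "interrupt at any moment" idea concrete, here is a toy
model, under the simplifying assumption that writes between two
barriers may reach the disk in any order while nothing crosses a
barrier. BARRIER and crash_states are names I made up for the sketch;
it models the guarantee, not any particular file system.

```python
from itertools import combinations

BARRIER = object()  # hypothetical marker separating groups of writes


def crash_states(ops):
    """Return every set of writes that may be on disk after a crash.

    Assumed model: writes before a barrier are all persisted before
    any write after it; within one group any subset may be persisted.
    """
    # Split the timeline into groups of writes delimited by barriers.
    epochs = [[]]
    for op in ops:
        if op is BARRIER:
            epochs.append([])
        else:
            epochs[-1].append(op)

    states = set()
    done = []  # writes from fully completed earlier groups
    for epoch in epochs:
        for r in range(len(epoch) + 1):
            for subset in combinations(epoch, r):
                states.add(frozenset(done) | frozenset(subset))
        done.extend(epoch)
    return states


states = crash_states(["X", BARRIER, "Y"])
```

For the X / barrier / Y sequence this yields three possible states
(nothing, X only, X and Y). The state "Y without X" never appears,
which is exactly the property a snapshot has to preserve.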

The "event" of suddenly interrupting I/O can be caused by several
things, such as a kernel panic (some failed assertion) halting all
kernel activity, a power outage causing a restart, an LVM atomic
snapshot being taken (in which case the I/O stops in the timeline of
the snapshot), or an EBS snapshot.

Only if the EBS snapshots are not consistent, or write barriers are
somehow violated on the EBS volume, would an EBS snapshot not be
consistent. Freezing is not required. But again, see my previous post
about freeze maybe being probabilistically useful anyway.

-- 
/ Peter Schuller
