cassandra-user mailing list archives

From Peter Schuller
Subject Re: sync commitlog in batch mode lose data
Date Tue, 07 Jun 2011 15:23:32 GMT
> But I have another question, while I disable the disk cache but leave the cache write
> mode write-back, how sync works? Still write the data into the cache? This issue may not
> belong to the scope of discussion here.

I'm not sure; it depends on the level of abstraction at which you
switched to write-back and on how it's implemented. Generally, the
contract of an fsync() is that whatever was written up to that point
must be persistent (i.e., readable by subsequent reads, even after a
power outage or crash) when the call returns. This usually means:

(1) the userland app must flush its buffers and write the data to the
kernel (this is done prior to fsync())
(2) the OS file system code needs to write whatever is necessary to the
underlying block device(s)
(3) the underlying block device(s) need to be told to insert a write
barrier or flush their caches, depending on the setup
(4) the underlying block device itself must handle this correctly
  (a) for a non-battery-backed disk it means flushing the cache, and
you have to wait for that to happen - at minimum seek + rotational
latency
  (b) for a battery-backed RAID device it is typically a no-op if the
battery backup unit (BBU) is working, as the RAID controller cache is
considered persistent
  (c) for a RAID device with caching turned off or the BBU being
inoperable, it usually means asking the individual real drives to flush
their caches
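
As a rough illustration of step (1) followed by the fsync() call itself,
here is a minimal Python sketch. The function name durable_append and
the file name are made up for the example and are not from Cassandra;
whether steps (2) to (4) actually reach stable storage still depends on
the OS, driver and hardware behaving as described above.

```python
import os
import tempfile


def durable_append(path: str, payload: bytes) -> None:
    """Append payload to path and return only after fsync() has returned.

    Hypothetical helper for illustration; not Cassandra's commit log code.
    """
    with open(path, "ab") as f:
        f.write(payload)      # data sits in the userland (Python) buffer
        f.flush()             # step (1): hand the buffered bytes to the kernel
        os.fsync(f.fileno())  # steps (2)-(4): ask the OS and device to persist them


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "commitlog.bin")  # hypothetical file name
        durable_append(log, b"mutation-1\n")
        durable_append(log, b"mutation-2\n")
        with open(log, "rb") as f:
            print(f.read())
```

Note that the flush() before os.fsync() matters: fsync() only covers
data the kernel has already been given, so bytes still sitting in the
userland buffer would not be included.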

However, in general I advise care, since all sorts of little details
can derail this from working. For example, if you have the kernel
driver configured not to propagate write barriers to the RAID
controller, and the RAID controller has its BBU turned off but is still
caching, an fsync() would not work for the power outage case. Using
LVM in certain configurations can break write barriers at the OS level
(at least up to not very long ago; maybe fixed in newer kernels) - and
the list goes on.
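
One small sanity check, sketched here in Python under some assumptions:
reasonably recent Linux kernels expose what the kernel believes about a
block device's cache in /sys/block/<dev>/queue/write_cache, which reads
"write back" or "write through". That file is an assumption about the
system at hand (it is not available on older kernels), the device name
sda below is hypothetical, and the helper names are made up. It only
reports the kernel's view; it cannot tell you what a RAID controller
behind the driver really does with its own cache.

```python
from pathlib import Path


def write_cache_mode(sysfs_file: str) -> str:
    """Return the kernel-reported cache mode string for a block device."""
    return Path(sysfs_file).read_text().strip()


def fsync_depends_on_flush(sysfs_file: str) -> bool:
    """True if the kernel sees a volatile write-back cache on the device,
    i.e. durability relies on cache flushes being issued and honoured."""
    return write_cache_mode(sysfs_file) == "write back"


if __name__ == "__main__":
    # Hypothetical device name; adjust for the system being checked.
    candidate = "/sys/block/sda/queue/write_cache"
    if Path(candidate).exists():
        print(candidate, "->", write_cache_mode(candidate))
    else:
        print("no write_cache attribute found at", candidate)
```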

/ Peter Schuller
