incubator-cassandra-user mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject Re: sync commitlog in batch mode lose data
Date Fri, 03 Jun 2011 08:49:59 GMT
> I disable the disk cache of RAID controller,  unfortunately it still lost
> some data.

Disabling caching shouldn't be necessary; what matters is that all
layers honor write barriers properly. A battery backed cache that
survives a power outage need not be disabled (and if you have battery
backed caching you usually don't want to disable it, since doing so
has a considerable performance impact).

To re-address your original post: Yes, given QUORUM @ RF=2 (meaning
that QUORUM is equivalent to ALL), any *successful* write is supposed
to be guaranteed to be visible to a subsequent read. In this case that
holds even for a read at CL.ONE, since RF was 2 and QUORUM was
equivalent to ALL.
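
To spell out the arithmetic, here is a small Python sketch. The
function names are made up for illustration and this is not Cassandra
code; it only assumes the usual rule that QUORUM means floor(RF/2) + 1
replicas, and that a read is guaranteed to overlap a write when the
replicas written plus the replicas read exceed RF.

```python
def quorum(rf):
    """Replicas that must acknowledge at QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


def read_overlaps_write(rf, write_acks, read_replicas):
    """True if every read set must share at least one replica with the write set."""
    return write_acks + read_replicas > rf


rf = 2
w = quorum(rf)  # with RF=2 this is 2, i.e. the same as ALL
print("QUORUM at RF=%d needs %d replicas" % (rf, w))
print("read at CL.ONE sees the write:", read_overlaps_write(rf, w, 1))
```

With RF=3 the same check gives quorum(3) == 2, and a QUORUM write
followed by a CL.ONE read is no longer guaranteed to overlap.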

If this is not what you're seeing, likely causes are either (a) a
problem with your test, (b) a cassandra bug, or (c) a kernel/hardware
misconfiguration or bug that causes fsync() to be broken with respect
to power outages.

In order to eliminate (a), can you share the actual test? Even if (a)
looks good, you'd be surprised as to how often (c) can be the case.

If you are satisfied that the test is correct, one way to eliminate
Cassandra as a cause for the problem may be to restart your server with
a reset instead of cutting power, so that the power supply never
disappears from your storage device. If you are no longer able to
reproduce the problem, it would indicate that fsync() is at least
causing I/O to reach a device (that is, to leave the operating
system). If it still fails, you're none the wiser.

Whether or not you are running with battery backed cache, one test
you can do is run this (on a system which is otherwise idle):

   http://distfiles.scode.org/mlref/fsynctime.py

The first argument is a filename which will be created/over-written.
It will then start printing the number of milliseconds each fsync()
takes. If you do not have battery backed caching, you should be seeing
numbers in the 5-25 ms range depending on circumstances. If you see
very low values, that indicates that fsync() is not working and the
writes are not forced to persistent storage.

(If battery backed caching exists, you will legitimately get very low
values without it indicating that anything is wrong.)
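
In case the link is unavailable, the measurement is roughly the
following. This is a minimal sketch of the same idea, not the contents
of fsynctime.py; the function name, the one-byte write and the default
count are my own assumptions. It only uses os.write(), os.fsync() and
time.perf_counter() from the standard library.

```python
import os
import sys
import time


def time_fsyncs(path, count=10):
    """Append one byte and fsync() it `count` times; return each fsync time in ms."""
    durations = []
    # Create the file, or truncate it if it already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for _ in range(count):
            os.write(fd, b"x")  # make sure there is dirty data to flush
            start = time.perf_counter()
            os.fsync(fd)  # should block until the data is on stable storage
            durations.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return durations


if __name__ == "__main__" and len(sys.argv) > 1:
    for ms in time_fsyncs(sys.argv[1]):
        print("%.2f ms" % ms)
```

Run it with the target filename as the only argument, on the same
filesystem that holds the commit log, and compare the numbers against
the ranges above.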


-- 
/ Peter Schuller
