cassandra-commits mailing list archives

From "Zhu Han (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-3248) CommitLog writer should call fdatasync instead of fsync
Date Fri, 23 Sep 2011 12:40:26 GMT
CommitLog writer should call fdatasync instead of fsync

                 Key: CASSANDRA-3248
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.8.6, 1.0.0, 1.1
         Environment: Linux
            Reporter: Zhu Han

CommitLogSegment uses SequentialWriter to flush the buffered data to the log device. It depends
on FileDescriptor#sync(), which invokes fsync() and therefore also forces the file attributes
(metadata) to disk.

However, at least on Linux, fdatasync() is good enough for commit log flush:

bq. fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata
is needed in order to allow a subsequent data retrieval to be correctly handled. For example,
changes to st_atime or st_mtime (respectively, time of last access and time of last modification;
see stat(2)) do not require flushing because they are not necessary for a subsequent data
read to be handled correctly. On the other hand, a change to the file size (st_size, as
made by say ftruncate(2)), would require a metadata flush.

The file size is therefore synced to disk by fdatasync() as well. Although the commit log recovery
logic sorts the commit log segments by their modification timestamp, that dependency can be
removed safely, IMHO.
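
One way the recovery ordering could avoid relying on st_mtime is to sort segments by a number
embedded in the file name. The following is only a sketch: the "CommitLog-<number>.log" pattern
and the class and method names are assumptions made for illustration, not the actual Cassandra
recovery code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SegmentOrderSketch {
    // Hypothetical naming scheme: "CommitLog-<number>.log".
    private static final Pattern NAME = Pattern.compile("CommitLog-(\\d+)\\.log");

    /** Extracts the numeric id from a segment file name. */
    static long idOf(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a segment name: " + fileName);
        }
        return Long.parseLong(m.group(1));
    }

    /** Returns the names in replay order, using the embedded id instead of the mtime. */
    static List<String> replayOrder(List<String> fileNames) {
        List<String> sorted = new ArrayList<>(fileNames);
        sorted.sort(Comparator.comparingLong(SegmentOrderSketch::idOf));
        return sorted;
    }
}
```

The ids are compared numerically rather than as strings, so a segment numbered 9 sorts before
one numbered 10.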

I checked the native code of JRE 6. On Linux and Solaris, FileChannel#force(false) invokes
fdatasync(). On Windows, the false flag does not have any impact.
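
A minimal sketch of what the proposed change amounts to at the API level: append a record and
sync it with FileChannel#force(false) rather than a metadata-forcing sync. The class name,
method name and temp-file setup are hypothetical, and the sketch uses java.nio.file (Java 7+)
for brevity; it is not the SequentialWriter code.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class DataSyncSketch {
    /** Appends one record, syncs it, and returns the file size afterwards. */
    static long appendAndSync(Path log, byte[] record, boolean syncMetadata) throws IOException {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // force(false) asks only for the file content to reach the device
            // (fdatasync() on Linux, per the report above); force(true) also
            // requests a metadata flush.
            ch.force(syncMetadata);
            return ch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("commitlog-sketch", ".log");
        try {
            long size = appendAndSync(log, new byte[2048], false);
            System.out.println("bytes on file after sync: " + size);
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```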

On my log device (commodity SATA HDD, write cache disabled), fsync() and fdatasync() show a large
performance gap:
$ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G --file-fsync-all=on \
    --file-fsync-mode=fdatasync --file-test-mode=seqwr --max-time=600 --file-block-size=2K --max-requests=0

54.90 Requests/sec executed
   per-request statistics:
         min:                                  8.29ms
         avg:                                 18.18ms
         max:                                108.36ms
         approx.  95 percentile:              25.02ms

$ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G --file-fsync-all=on \
    --file-fsync-mode=fsync --file-test-mode=seqwr --max-time=600 --file-block-size=2K --max-requests=0

28.08 Requests/sec executed
   per-request statistics:
         min:                                 33.28ms
         avg:                                 35.61ms
         max:                                911.87ms
         approx.  95 percentile:              41.69ms
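
A rough JVM-side counterpart to the sysbench runs above could look like the sketch below. It
preallocates a file, then rewrites it sequentially in 2K blocks and calls FileChannel#force
after every write. The class name, method name and request count are made up for illustration,
and the numbers it prints will depend on the device, file system and write-cache settings, so
it is not expected to reproduce the figures above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SyncBenchSketch {
    /** Sequentially rewrites a preallocated file, syncing after each block. */
    static double requestsPerSecond(Path file, int requests, int blockSize,
                                    boolean syncMetadata) throws IOException {
        ByteBuffer block = ByteBuffer.allocate(blockSize);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // Preallocate so the timed loop does not change the file size,
            // which would require a metadata flush even with force(false).
            for (int i = 0; i < requests; i++) {
                block.clear();
                while (block.hasRemaining()) {
                    ch.write(block);
                }
            }
            ch.force(true);
            ch.position(0);

            long start = System.nanoTime();
            for (int i = 0; i < requests; i++) {
                block.clear();
                while (block.hasRemaining()) {
                    ch.write(block);
                }
                ch.force(syncMetadata);
            }
            long elapsed = Math.max(System.nanoTime() - start, 1L);
            return requests / (elapsed / 1e9);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sync-bench", ".dat");
        try {
            System.out.println("force(false): " + requestsPerSecond(file, 200, 2048, false) + " req/s");
            System.out.println("force(true):  " + requestsPerSecond(file, 200, 2048, true) + " req/s");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```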

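With fdatasync, throughput roughly doubles (54.90 vs. 28.08 requests/sec) and average latency
is roughly halved (18.18ms vs. 35.61ms), so I do think this is a very critical performance
improvement.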
