hadoop-hdfs-dev mailing list archives

From "Ivan Mitic (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-4677) Editlog should support synchronous writes
Date Tue, 09 Apr 2013 06:58:18 GMT
Ivan Mitic created HDFS-4677:

             Summary: Editlog should support synchronous writes
                 Key: HDFS-4677
                 URL: https://issues.apache.org/jira/browse/HDFS-4677
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 3.0.0, 1-win
            Reporter: Ivan Mitic
            Assignee: Ivan Mitic

In the current implementation, the NameNode editlog syncs to persistent storage using
the {{FileChannel#force}} Java API. This API is documented to be slower than the alternative
of opening a {{RandomAccessFile}} in "rws" mode (synchronous writes).

We instrumented {{FileChannel#force}} on Windows, and in some software/hardware configurations
it can perform significantly slower than the "rws" alternative.

In terms of the Windows APIs, {{FileChannel#force}} internally calls [FlushFileBuffers|http://msdn.microsoft.com/en-us/library/windows/desktop/aa364439(v=vs.85).aspx],
while {{RandomAccessFile}} ("rws") opens the file with the [FILE_FLAG_WRITE_THROUGH flag|http://support.microsoft.com/kb/99794].
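
For illustration only, the two approaches can be compared with a rough timing sketch like the one below. It is not the instrumentation used for this report: the class name, method names, record payload, and record count are all made up for the example, and a single wall-clock measurement like this is only indicative. It uses {{force(true)}} because the Java documentation describes "rws" as the counterpart of forcing both content and metadata.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and payload are illustrative, not HDFS code.
public class SyncWriteTiming {

    // Buffered writes, each followed by an explicit FileChannel#force.
    static long timeForce(File file, byte[] record, int count) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                ByteBuffer buf = ByteBuffer.wrap(record);
                while (buf.hasRemaining()) {
                    channel.write(buf);
                }
                channel.force(true); // flush content and metadata
            }
            return System.nanoTime() - start;
        }
    }

    // Synchronous writes: the file is opened in "rws" mode, so no explicit
    // force call is made after each write.
    static long timeRws(File file, byte[] record, int count) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rws")) {
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                raf.write(record);
            }
            return System.nanoTime() - start;
        }
    }

    public static void main(String[] args) throws IOException {
        File forceFile = File.createTempFile("force", ".log");
        File rwsFile = File.createTempFile("rws", ".log");
        forceFile.deleteOnExit();
        rwsFile.deleteOnExit();

        byte[] record = "placeholder-record\n".getBytes(StandardCharsets.UTF_8);
        int count = 100; // arbitrary

        System.out.println("force: " + timeForce(forceFile, record, count) + " ns");
        System.out.println("rws:   " + timeRws(rwsFile, record, count) + " ns");
    }
}
```

Results from a sketch like this depend heavily on the operating system, file system, and disk write-cache settings, so they should not be generalized beyond the machine they were measured on.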

With this Jira I'd like to introduce a flag that provides a means to configure the NameNode to use
synchronous writes. There is a catch, though: the behavior of the "rws" mode is platform and
hardware specific, and it might not provide the same level of guarantees as {{FileChannel#force}}
w.r.t. flushing the on-disk cache. This is an expert-level setting, and it should be documented
as such.
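
As a rough sketch of what such a switch could look like (not the proposed patch), the snippet below selects the file mode from a boolean setting. The property name {{example.editlog.sync.writes}}, the class, and the method names are hypothetical and do not correspond to any existing HDFS configuration key or class.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the actual HDFS edit log implementation.
public class EditLogOpenSketch {

    // Made-up property name used only for this example.
    static final String SYNC_WRITES_KEY = "example.editlog.sync.writes";

    // "rws" asks for every write to go synchronously to storage;
    // "rw" leaves syncing to an explicit force call.
    static String modeFor(boolean syncWrites) {
        return syncWrites ? "rws" : "rw";
    }

    // Appends one record, syncing according to the selected strategy.
    static void append(File file, byte[] record, boolean syncWrites) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, modeFor(syncWrites))) {
            raf.seek(raf.length());
            raf.write(record);
            if (!syncWrites) {
                // Default path: explicit flush through FileChannel#force.
                raf.getChannel().force(true);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Reads a JVM system property; defaults to false when unset.
        boolean syncWrites = Boolean.getBoolean(SYNC_WRITES_KEY);

        File log = File.createTempFile("edits", ".log");
        log.deleteOnExit();
        append(log, "placeholder-record\n".getBytes(StandardCharsets.UTF_8), syncWrites);
        System.out.println("mode=" + modeFor(syncWrites) + " bytes=" + log.length());
    }
}
```

Keeping the explicit force path as the default matches the intent that synchronous writes stay an opt-in, expert-level choice.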

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
