hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4677) Editlog should support synchronous writes
Date Wed, 22 May 2013 16:11:23 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664217#comment-13664217 ]

Chris Nauroth commented on HDFS-4677:

+1 for both trunk and branch-1-win patches.  Thank you for tracking down this performance
issue and contributing a fix!

Quick question: should this go to branch-2 as well, given that there was a bit of refactoring
going on? This would make things easier for future backports. I just checked, and the patch
is almost completely compatible.

It would be possible to get this into branch-2, considering that the patch doesn't really
have any Windows-specific code.  If we put it in branch-2 right now, then it would be less
code to merge later.

It looks like we'd have to resolve a conflict in {{TestNNStorageRetentionManager}}.  I'd suggest
that if it's easy enough to resolve that conflict now, then we might as well go for it.

> Editlog should support synchronous writes
> -----------------------------------------
>                 Key: HDFS-4677
>                 URL: https://issues.apache.org/jira/browse/HDFS-4677
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 1-win
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>         Attachments: HDFS-4677.2.patch, HDFS-4677.3.patch, HDFS-4677.branch-1-win.patch,
> In the current implementation, NameNode editlog performs syncs to the persistent storage
using the {{FileChannel#force}} Java API. This API is documented to be slower than the
alternative of opening a {{RandomAccessFile}} in "rws" mode (synchronous writes).

> We instrumented {{FileChannel#force}} on Windows and in some software/hardware configurations
it can perform significantly slower than the “rws” alternative.
> In terms of the Windows APIs, FileChannel#force internally calls [FlushFileBuffers|http://msdn.microsoft.com/en-us/library/windows/desktop/aa364439(v=vs.85).aspx]
while RandomAccessFile (“rws”) opens the file with the [FILE_FLAG_WRITE_THROUGH flag|http://support.microsoft.com/kb/99794].

> With this Jira I'd like to introduce a flag that provides a means to configure NameNode
to use synchronous writes. There is a catch, though: the behavior of the "rws" mode is platform
and hardware specific and might not provide the same level of guarantees as {{FileChannel#force}}
w.r.t. flushing the on-disk cache. This is an expert-level setting, and it should be documented
as such.
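
For readers less familiar with the two Java APIs being compared, here is a minimal sketch of each approach. It is not the code from the attached patches; the class name {{SyncWriteSketch}}, the method names, and the temporary file are hypothetical and exist only for illustration. The first method writes through a {{FileChannel}} and then calls {{force(true)}}; the second opens the file in "rws" mode so that each write is synchronous.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the HDFS-4677 patches.
public class SyncWriteSketch {

    // Approach 1: ordinary write, followed by an explicit FileChannel#force.
    static void writeWithForce(File file, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            FileChannel channel = raf.getChannel();
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
            // true = force both file content and metadata to storage.
            channel.force(true);
        }
    }

    // Approach 2: open with "rws" so every update to content or metadata
    // is written synchronously; no separate force call is made.
    static void writeSynchronously(File file, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rws")) {
            raf.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] record = "edit-record".getBytes(StandardCharsets.UTF_8);

        File forced = File.createTempFile("edits-force", ".log");
        File synced = File.createTempFile("edits-rws", ".log");
        forced.deleteOnExit();
        synced.deleteOnExit();

        writeWithForce(forced, record);
        writeSynchronously(synced, record);

        System.out.println("force: " + forced.length() + " bytes");
        System.out.println("rws:   " + synced.length() + " bytes");
    }
}
```

Both methods leave the same bytes in the file. What differs is how and when the data reaches storage, and, as noted above, the guarantees of "rws" mode with respect to the on-disk cache depend on the platform and hardware.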

