hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2759) Pre-allocate HDFS edit log files after writing version number
Date Thu, 26 Jan 2012 00:22:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193466#comment-13193466 ]

Aaron T. Myers commented on HDFS-2759:

Sorry, I should have mentioned what I did for testing. No tests are included since writing
a test seems like it would require adding a test hook in the EditLogFileOutputStream#flushAndSync
method, which seems undesirable since that method is highly performance-sensitive. I verified
that the bug exists by adding a "System.exit(0)" after pre-allocation but before the call to
FileChannel#force. This indeed resulted in a file full of 0xFF bytes, which an NN attempting
to read the log would interpret as an invalid header version number. By moving the pre-allocation
after the call to FileChannel#force, we guarantee that the valid header data hits the edit log
before any 0xFF fill bytes are written for pre-allocation.
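The fixed ordering can be sketched like this (a self-contained sketch, not the real EditLogFileOutputStream code; the header value, file name, and sizes are made up for illustration):

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.util.Arrays;

public class PreallocateAfterForce {
    // Writes a fake 4-byte header, forces it to disk, then pre-allocates
    // with 0xFF fill bytes; returns the resulting file contents.
    static byte[] writeAndPreallocate(File f) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            FileChannel fc = raf.getChannel();

            // 1. Write the header first (here: a made-up layout version, -40).
            ByteBuffer header = ByteBuffer.allocate(4).putInt(-40);
            header.flip();
            fc.write(header, 0);

            // 2. Sync the header data before pre-allocating, so a crash can
            //    never leave 0xFF bytes where the version number belongs.
            fc.force(false);

            // 3. Only now pre-allocate the rest of the file with 0xFF fill.
            byte[] fill = new byte[1024];
            Arrays.fill(fill, (byte) 0xff);
            fc.write(ByteBuffer.wrap(fill), 4);
        }
        return Files.readAllBytes(f.toPath());
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("edits", ".log");
        byte[] onDisk = writeAndPreallocate(f);
        // The first int is the header, not 0xFFFFFFFF from the fill bytes.
        System.out.println(ByteBuffer.wrap(onDisk).getInt()); // prints -40
        f.delete();
    }
}
```

If the process died between steps 2 and 3, the file would simply be shorter than expected; it could no longer begin with 0xFF bytes in place of the version number.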

In the course of preparing and testing this patch, I think I've discovered another potential
issue, though. Note that in the call to FileChannel#force, we pass "false" which indicates
that FileChannel#force does not need to wait for metadata of the file to be synced to disk,
only the data. I'm not sure of the precise semantics of not syncing metadata. In particular,
is the file length included? If not, I think this has the potential to cause some edits never
to be read back from disk after a crash.

The explanation, per the comment, is that syncing metadata is unnecessary because of pre-allocation.
I don't think that's reasonable, though, since EditLogFileOutputStream#preallocate doesn't
call sync itself, which means the on-disk file length might never be updated upon returning
from EditLogFileOutputStream#flushAndSync.
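Per the FileChannel#force javadoc, passing false is analogous to fdatasync(2) (data only), while passing true is analogous to fsync(2) (data plus metadata such as file length). A minimal sketch of the concern, with a made-up helper name and illustrative sizes:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ForceMetadataSketch {
    // Appends data past the current end of the file and syncs it.
    // If syncMetadata is false, the on-disk file length may not be
    // updated before a crash, so the appended edits could be invisible
    // on restart even though their bytes reached the disk.
    static long appendAndSync(File f, byte[] data, boolean syncMetadata)
            throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            FileChannel fc = raf.getChannel();
            fc.write(ByteBuffer.wrap(data), fc.size()); // grows the file
            fc.force(syncMetadata); // false == fdatasync-like: length may lag
            return fc.size();
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("edits", ".log");
        long len = appendAndSync(f, new byte[]{1, 2, 3, 4}, true);
        System.out.println(len); // prints 4
        f.delete();
    }
}
```

Skipping the metadata sync is safe only while writes stay within an already-recorded file length, which is exactly why the pre-allocation itself would need a metadata sync for the argument in the comment to hold.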

Do people agree this is a potential problem? If so, I can either fix it here or file a new JIRA.

> Pre-allocate HDFS edit log files after writing version number
> -------------------------------------------------------------
>                 Key: HDFS-2759
>                 URL: https://issues.apache.org/jira/browse/HDFS-2759
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, name-node
>    Affects Versions: 0.24.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-2759.patch
> In HDFS-2709 it was discovered that there's a potential race wherein edit log files
> are pre-allocated before the version number is written into the header of the file. This can
> cause the NameNode to read an invalid HDFS layout version, and hence fail to read the edit
> log file. We should write the header, then pre-allocate the rest of the file after this point.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

