hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-3510) FSEditLog pre-allocation does not work in branch-1
Date Tue, 05 Jun 2012 23:38:23 GMT

     [ https://issues.apache.org/jira/browse/HDFS-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-3510:

    Attachment: HDFS-3510-b1.002.patch

* fill with all 0xff, so that we don't have to handle OP_INVALID specially.

Also, technically it is undefined what byte pattern ByteBuffer.allocateDirect fills the buffer
with, although it is 0 in practice in Oracle implementations.  In JDK7 they specified that
it zero-fills.
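
As a rough illustration of the point above (this is not the code from the attached patch), a pre-allocation buffer can be filled explicitly instead of relying on whatever ByteBuffer.allocateDirect leaves in it. The class name PreallocFill and the helper newFilledBuffer are hypothetical names chosen for this sketch.

```java
import java.nio.ByteBuffer;

public class PreallocFill {
    /** Hypothetical helper: returns a direct buffer with every byte set to 0xff. */
    static ByteBuffer newFilledBuffer(int size) {
        ByteBuffer buf = ByteBuffer.allocateDirect(size);
        // Do not rely on the initial contents; set every byte explicitly.
        for (int i = 0; i < size; i++) {
            buf.put(i, (byte) 0xff);
        }
        // Absolute puts leave position at 0 and limit at capacity,
        // so the whole buffer is ready to be written out.
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = newFilledBuffer(8);
        System.out.println(buf.remaining());   // 8
        System.out.println(buf.get(0) & 0xff); // 255
    }
}
```

Filling explicitly costs one pass over the buffer at allocation time, and the result no longer depends on which JDK version is in use.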
> FSEditLog pre-allocation does not work in branch-1
> --------------------------------------------------
>                 Key: HDFS-3510
>                 URL: https://issues.apache.org/jira/browse/HDFS-3510
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 1.0.0
>         Attachments: HDFS-3510-b1.001.patch, HDFS-3510-b1.002.patch
> In the FSEditLog, we want to avoid running out of space in the middle of writing an edit
log operation to the disk.  We do this by a process called "preallocation"-- reserving space
on the disk for the upcoming edit log entries before beginning to write them.
> branch-1 has some major problems with the way it does preallocation.  These problems
can lead to corrupt edit logs when the disk runs out of space during an edit log sync operation.
> The problems are:
> * We use FileChannel#write without checking for short writes, but WritableByteChannel
explicitly documents that they are possible, and the FileChannel subclass is silent on the
> * We only try to do preallocation when the current position is less than 4096 bytes from
the end of the file.  However, bufReady starts out at 512 KB, and only gets bigger from there.
There is no way that 4 KB is enough space to reserve.
> * The current code seems to be based on a misunderstanding of how space is allocated
in files in Linux.  In FileChannel#write(ByteBuffer, long), the second argument is the offset
to start writing at.  Since we set this to fc.position() + 1024*1024, this means that we *start*
writing a megabyte after the end of the file.  This is guaranteed to create a sparse file
on Linux, defeating the point of pre-allocation.
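
The problems listed above suggest what a working version needs: loop until FileChannel#write has consumed the whole buffer, and start writing at the current end of the file rather than past it, so that no hole is created. The following is a minimal sketch under those assumptions, not the code from the attached patch. The names PreallocSketch, writeFully and preallocate, and the chunk size used in main, are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class PreallocSketch {
    /** Writes all remaining bytes of buf starting at offset, looping over short writes. */
    static void writeFully(FileChannel fc, ByteBuffer buf, long offset) throws IOException {
        while (buf.hasRemaining()) {
            // write(ByteBuffer, long) may write fewer bytes than requested.
            offset += fc.write(buf, offset);
        }
    }

    /**
     * Extends the file with 0xff bytes, in whole chunks, until at least
     * `needed` bytes exist beyond the channel's current position.
     * Writing starts at the current end of the file, so no hole is left.
     */
    static void preallocate(FileChannel fc, long needed, int chunkSize) throws IOException {
        long target = fc.position() + needed;
        long size = fc.size();
        ByteBuffer fill = ByteBuffer.allocate(chunkSize);
        Arrays.fill(fill.array(), (byte) 0xff);
        while (size < target) {
            fill.clear(); // reset position and limit; contents are kept
            writeFully(fc, fill, size);
            size += chunkSize;
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("edits", ".tmp");
        try (FileChannel fc = FileChannel.open(p, StandardOpenOption.WRITE)) {
            preallocate(fc, 1000, 512);
            System.out.println(fc.size());     // 1024
            System.out.println(fc.position()); // 0
        } finally {
            Files.delete(p);
        }
    }
}
```

The positional write does not move the channel's position, so later edit log writes still begin where they would have without pre-allocation. An IOException thrown here (for example, when the disk is full) surfaces before any edit log operation has been partially written.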
