hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3510) Improve FSEditLog pre-allocation
Date Thu, 21 Jun 2012 17:31:44 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398614#comment-13398614 ]

Colin Patrick McCabe commented on HDFS-3510:

> In the current code after every flush the current buffer has OP_INVALID at the end, added
> in EditLogFileOutputStream#setReadyToFlush. This is flushed to a file.

Right.  However, this patch changes EditLogFileOutputStream#setReadyToFlush so that it doesn't
do that any more.

What I was trying to explain above (sorry if it was unclear) was that this is no longer necessary.

As you know, an OP_INVALID operation is a single byte whose value is 0xff.

Back before HDFS-1846, we would have:
--- journal records --- 0xff 0x00 0x00 0x00 0x00 ... end-of-file
You can see that the single 0xff byte at the end is necessary because otherwise we'll see
zeros and try to interpret them as an OP_ADD (opcode 0).

However, after HDFS-1846 went in, we now have:
--- journal records --- 0xff 0xff 0xff 0xff 0xff ... end-of-file

There isn't any need to explicitly add another byte of 0xff because *all* of the padding bytes
are 0xff.  We already added all the OP_INVALIDs we needed to during pre-allocation; there's
no need to add another.
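
To make the two layouts above concrete, here is a minimal in-memory sketch. It is not the actual HDFS reader or on-disk format: the record layout ([opcode][payload length][payload]), the class name, the helper names, and opcode 1 are all hypothetical and exist only for illustration. The only facts taken from the discussion are that OP_INVALID is the single byte 0xff and that OP_ADD is opcode 0.

```java
import java.util.Arrays;

// Illustration only: a toy edit log scanner, NOT the real HDFS format or reader.
class EditLogPaddingSketch {
    static final byte OP_INVALID = (byte) 0xff; // single-byte "no more records" opcode
    static final byte OP_ADD = 0;               // opcode 0

    // Hypothetical record layout: [opcode][payloadLength][payload bytes...].
    // Counts records until OP_INVALID or the end of the buffer is reached.
    static int countRecords(byte[] log) {
        int pos = 0;
        int count = 0;
        while (pos < log.length) {
            if (log[pos] == OP_INVALID) {
                break;                          // terminator or 0xff padding: stop
            }
            if (pos + 1 >= log.length) {
                break;                          // truncated header: stop
            }
            int payloadLength = log[pos + 1] & 0xff;
            pos += 2 + payloadLength;
            count++;
        }
        return count;
    }

    // Builds records, an optional explicit OP_INVALID byte, then padding of `fill`.
    static byte[] buildLog(byte[] records, int paddingLength, byte fill,
                           boolean explicitTerminator) {
        int total = records.length + (explicitTerminator ? 1 : 0) + paddingLength;
        byte[] log = new byte[total];
        Arrays.fill(log, fill);                 // simulate pre-allocation
        System.arraycopy(records, 0, log, 0, records.length);
        if (explicitTerminator) {
            log[records.length] = OP_INVALID;
        }
        return log;
    }

    public static void main(String[] args) {
        // Two records: OP_ADD with a 2-byte payload, hypothetical opcode 1 with a 1-byte payload.
        byte[] records = { OP_ADD, 2, 0x41, 0x42, 1, 1, 0x43 };

        // Zero padding with an explicit trailing 0xff.
        System.out.println(countRecords(buildLog(records, 4, (byte) 0x00, true)));
        // Zero padding without the 0xff: the zeros are misread as extra OP_ADD records.
        System.out.println(countRecords(buildLog(records, 4, (byte) 0x00, false)));
        // 0xff padding without an explicit 0xff: the padding itself ends the scan.
        System.out.println(countRecords(buildLog(records, 4, OP_INVALID, false)));
    }
}
```

In this toy model the first and third cases report the two real records, while the zero-padded log without a terminator reports phantom records. That is the reason the extra 0xff byte mattered with zero padding and is redundant once the padding is 0xff.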
> Improve FSEditLog pre-allocation
> --------------------------------
>                 Key: HDFS-3510
>                 URL: https://issues.apache.org/jira/browse/HDFS-3510
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 1.0.0, 2.0.0-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 1.0.0, 2.0.1-alpha
>         Attachments: HDFS-3510-b1.001.patch, HDFS-3510-b1.002.patch, HDFS-3510.001.patch, HDFS-3510.003.patch, HDFS-3510.004.patch, HDFS-3510.004.patch, HDFS-3510.006.patch, HDFS-3510.007.patch, HDFS-3510.008.patch, HDFS-3510.009.patch
>
> It is good to avoid running out of space in the middle of writing a batch of edits, because when it happens, we often get partial edits at the end of the log.
> Edit log preallocation can solve this problem (see HADOOP-2330 for a full description of edit log preallocation).
> The current pre-allocation code was introduced for performance reasons, not for preventing partial edits.  As a consequence, we sometimes do a write without using pre-allocation.  We should change the pre-allocation code so that it always preallocates at least enough space before writing out the edits.


