hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3521) Allow namenode to tolerate edit log corruption
Date Wed, 13 Jun 2012 00:37:42 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294028#comment-13294028 ]

Colin Patrick McCabe commented on HDFS-3521:

In general, this is an extremely large patch that has at least a few different major components:
1. add in.mark(TRANSACTION_LENGTH_LIMIT = 4GB)
2. add a configuration option for "edits toleration length."
3. some generic code cleanups (add a LOG object to FSEditLog, re-arrange import directives
in FSEditLog to avoid import *, rename EditLogOutputStream#fill to EditLogOutputStream#PREALLOCATION_BUFFER)
4. disable edit log toleration for the SecondaryNameNode

I think any one of these could have been a separate patch.  Putting them all together doesn't
seem wise.  But let me go step-by-step:

*1. add in.mark(TRANSACTION_LENGTH_LIMIT = 4GB).*
What's the rationale behind this change?  It seems to break recovery mode, unless "edit log
toleration" is explicitly turned off.  Also, because of the way Java streams are implemented,
this potentially means that we could have up to a 4GB buffer.
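
To illustrate the buffering concern, here is a minimal sketch using the JDK's BufferedInputStream rather than the actual edit log stream classes. The class and method names (MarkLimitDemo, replayAfterMark) are made up for this example. For reset() to succeed, the stream has to retain every byte read since mark(), so its internal buffer grows toward the read limit even though it started at 16 bytes.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class, not part of the patch or of Hadoop.
class MarkLimitDemo {
    /** Reads up to n bytes, stopping early only at end of stream. */
    private static byte[] readUpTo(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int off = 0;
        while (off < n) {
            int r = in.read(buf, off, n - off);
            if (r < 0) break;
            off += r;
        }
        return Arrays.copyOf(buf, off);
    }

    /**
     * Marks the stream, reads n bytes, rewinds with reset(), and reads them
     * again. Returns true if both passes saw the same bytes.
     */
    static boolean replayAfterMark(InputStream in, int readLimit, int n) {
        try {
            in.mark(readLimit);
            byte[] first = readUpTo(in, n);
            in.reset(); // only possible because everything since mark() was retained
            byte[] second = readUpTo(in, n);
            return Arrays.equals(first, second);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[32 * 1024];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        // Start with a tiny 16-byte buffer; the read limit is 64KB.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), 16);
        System.out.println(replayAfterMark(in, 64 * 1024, data.length));
    }
}
```

The replay only works because 32KB of data ended up held in memory behind a stream that was created with a 16-byte buffer. Scale the read limit up and the retained data scales with it.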

This diverges from the way we handle a similar problem in upstream.  Check out the StreamLimiter
interface if you want to see how upstream handles a similar problem (but again, I'm not sure
what problem this is solving, so this is just a guess on my part.)
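
For comparison, a limit can be enforced without buffering anything. The sketch below is only an illustration of that idea and is not the upstream StreamLimiter code: the class name LimitedReadStream, its setLimit/clearLimit methods, and the readsWithinLimit helper are all assumptions made for this example. It counts bytes as they are read and fails once a read would go past the configured limit.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration: rejects reads that would go past a byte limit.
class LimitedReadStream extends FilterInputStream {
    private long remaining = Long.MAX_VALUE; // no limit until setLimit() is called

    LimitedReadStream(InputStream in) { super(in); }

    /** Allow at most 'limit' more bytes to be read. */
    void setLimit(long limit) { remaining = limit; }

    /** Remove the limit again, e.g. once a transaction was read successfully. */
    void clearLimit() { remaining = Long.MAX_VALUE; }

    @Override
    public int read() throws IOException {
        checkLimit(1);
        int b = super.read();
        if (b >= 0) remaining--;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        checkLimit(len);
        int n = super.read(buf, off, len);
        if (n > 0) remaining -= n;
        return n;
    }

    private void checkLimit(long wanted) throws IOException {
        if (wanted > remaining) {
            throw new IOException("Tried to read " + wanted + " byte(s) with only "
                + remaining + " left before the limit");
        }
    }

    /** Helper for the example: true if 'toRead' single-byte reads stay within 'limit'. */
    static boolean readsWithinLimit(byte[] data, long limit, int toRead) {
        LimitedReadStream s = new LimitedReadStream(new ByteArrayInputStream(data));
        s.setLimit(limit);
        try {
            for (int i = 0; i < toRead; i++) {
                if (s.read() < 0) break; // end of data
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(readsWithinLimit(new byte[100], 64, 64));
        System.out.println(readsWithinLimit(new byte[100], 64, 65));
    }
}
```

This sketch keeps only a counter, so memory use does not depend on the limit. It does not cover skip(), which a real implementation would also need to account for.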

*2. add a configuration option for "edits toleration length."*

The UNCHECKED_REGION_LENGTH was carefully selected to be the same as the preallocation length.
 The issue is that if the preallocation length is 1MB, you know that you will never encounter
a partial write more than 1MB from the end of the file.

Reasoning: if there was a partial write, either:
1. It is in a region which you previously pre-allocated.  That means it's in the last 1MB
of the file, since we never allocate more than 1MB.
2. It is NOT in a region which you previously pre-allocated.  This means it almost certainly
is at the very end of the file.

Case #2 should almost never happen in practice, unless the NameNode writes an extremely huge
batch of edit log operations.  In both cases, the partial write should be in the last 1MB.
 That's the reason why the constant was chosen.  It is not something that the user can or
should configure.
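
The reasoning above boils down to a single offset comparison. The following is a rough sketch of that check, not the code from HDFS-3479: the class name UncheckedRegion and the method mayBePartialWrite are invented for illustration, and the 1MB value is the one assumed in the discussion above.

```java
// Hypothetical sketch of the "last 1MB" rule described above.
class UncheckedRegion {
    // Assumed value: the 1MB preallocation length used in the discussion.
    static final long PREALLOCATION_LENGTH = 1024 * 1024;
    // Tied to the preallocation length on purpose, not independently tunable.
    static final long UNCHECKED_REGION_LENGTH = PREALLOCATION_LENGTH;

    /**
     * True if a read failure at errorOffset lies within the last
     * UNCHECKED_REGION_LENGTH bytes of the file, i.e. where a partial
     * write could plausibly be.
     */
    static boolean mayBePartialWrite(long fileLength, long errorOffset) {
        return errorOffset >= fileLength - UNCHECKED_REGION_LENGTH;
    }

    public static void main(String[] args) {
        long tenMb = 10L * 1024 * 1024;
        // A failure one byte before the end of the file is inside the region.
        System.out.println(mayBePartialWrite(tenMb, tenMb - 1));
        // A failure in the middle of a 10MB file is outside the region.
        System.out.println(mayBePartialWrite(tenMb, tenMb / 2));
    }
}
```

Under this rule, a failure outside the final 1MB cannot be explained by a partial write into preallocated space, so it would be treated as real corruption.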

*3. code cleanups*
Some of these are ok in principle, but they should be in a separate change.  Also, this patch
introduces a few misspellings.

*4. disable edit log toleration for the SecondaryNameNode*
I'm ok with this, but we don't need the rest of this patch.

This patch is seriously huge, combines unrelated things, breaks recovery mode, and diverges
from upstream.  I think we need to revisit this.
> Allow namenode to tolerate edit log corruption
> ----------------------------------------------
>                 Key: HDFS-3521
>                 URL: https://issues.apache.org/jira/browse/HDFS-3521
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 1.2.0
>         Attachments: h3521_20120610_b-1.patch, h3521_20120611_b-1.patch
> HDFS-3479 adds checking for edit log corruption. It uses a fixed UNCHECKED_REGION_LENGTH
> (=PREALLOCATION_LENGTH) so that the bytes at the end within that length are not checked.  Instead
> of not checking the bytes, we should check everything and allow toleration.
