hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3004) Create Offline NameNode recovery tool
Date Tue, 28 Feb 2012 18:43:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218455#comment-13218455
] 

Colin Patrick McCabe commented on HDFS-3004:
--------------------------------------------

bq. Typically Namenode is configured to store edits in multiple directories. The tool should
handle this. If one of the copies is corrupt and the other is not, it should indicate the
same.

Yeah, this is definitely something we should do, and as Todd said, we should do it in the
normal NN loading process as well.
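
To make the multiple-directory case concrete, here is a rough Python sketch of the check. It is
only an illustration, not code from the attached patch: the function name compare_copies, the
directory names, and the caller-supplied valid_length callback (which returns how many leading
bytes of a copy are readable) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def compare_copies(
    copies: Dict[str, bytes],
    valid_length: Callable[[bytes], int],
) -> Tuple[str, List[str]]:
    """Pick the copy with the longest readable prefix and list the damaged ones.

    `copies` maps an edits directory (hypothetical names) to the raw bytes of
    its edit log. `valid_length` is supplied by the caller and returns the
    number of leading bytes that could be read without error.
    """
    if not copies:
        raise ValueError("no edit log copies supplied")

    lengths = {directory: valid_length(data) for directory, data in copies.items()}

    # Iterate in sorted order so that ties are resolved deterministically.
    best = max(sorted(lengths), key=lambda directory: lengths[directory])

    # A copy is reported if part of it is unreadable, or if it holds less
    # readable data than the best copy.
    damaged = sorted(
        directory
        for directory, data in copies.items()
        if lengths[directory] < len(data) or lengths[directory] < lengths[best]
    )
    return best, damaged
```

With something like this, the tool could load from the best copy and tell the operator which
directories need attention.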

bq. [What if] The editlog entry in the middle is corrupt, followed by clean entries (very
unlikely).

I think as a first-pass solution, we would simply truncate the edit log at the point it becomes
unreadable and write out the image in its current form.  (This is assuming that the edit log
is corrupt in all copies.)  Later down the road, we could consider various heuristics for
this case, but as you said, it's unclear how likely it is that the rest of the log will be
readable.
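
As a minimal sketch of the truncate-at-first-unreadable-entry idea, assuming a made-up record
framing (4-byte big-endian length, payload, 4-byte CRC32 of the payload) that is NOT the real
HDFS edit log format, the logic could look roughly like this in Python. The names
encode_record and readable_prefix are hypothetical.

```python
import struct
import zlib
from typing import List, Tuple

# Hypothetical framing for illustration only; not the HDFS edit log format.
HEADER = struct.Struct(">I")   # 4-byte big-endian payload length
TRAILER = struct.Struct(">I")  # 4-byte CRC32 of the payload


def encode_record(payload: bytes) -> bytes:
    """Frame one record in the illustrative format."""
    return HEADER.pack(len(payload)) + payload + TRAILER.pack(zlib.crc32(payload))


def readable_prefix(log: bytes) -> Tuple[List[bytes], int]:
    """Return the records read before the first unreadable one, together with
    the byte offset at which the log would be truncated."""
    records = []
    offset = 0
    while offset < len(log):
        body_start = offset + HEADER.size
        if body_start > len(log):
            break  # the length header itself is cut short
        (length,) = HEADER.unpack_from(log, offset)
        body_end = body_start + length
        record_end = body_end + TRAILER.size
        if record_end > len(log):
            break  # the payload or checksum is cut short
        payload = log[body_start:body_end]
        (stored_crc,) = TRAILER.unpack_from(log, body_end)
        if zlib.crc32(payload) != stored_crc:
            break  # corrupt record: stop here and keep what came before
        records.append(payload)
        offset = record_end
    return records, offset
```

Everything before the returned offset would be applied and the image written out; anything after
it, readable or not, is dropped in this first-pass approach.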

In general, I think the recovery logic that we implement will depend on the patterns of corruption
we see in the wild.  A missing or corrupted last entry seems to be a very common one (hopefully
this is less common now that we reserve space before writing).  If there are any other corruption
patterns you guys have seen, it would be really valuable to note them here.
                
> Create Offline NameNode recovery tool
> -------------------------------------
>
>                 Key: HDFS-3004
>                 URL: https://issues.apache.org/jira/browse/HDFS-3004
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-3004__namenode_recovery_tool.txt
>
>
> We've been talking about creating a tool which can process NameNode edit logs and image
files offline.
> This tool would be similar to a fsck for a conventional filesystem.  It would detect
inconsistencies and malformed data.  In cases where it was possible, and the operator asked
for it, it would try to correct the inconsistency.
> It's probably better to call this "nameNodeRecovery" or similar, rather than "fsck,"
since we already have a separate and unrelated mechanism which we refer to as fsck.
> The use case here is that the NameNode data is corrupt for some reason, and we want to
fix it.  Obviously, we would prefer never to get into this situation.  In a perfect world, we never
would.  However, bad data on disk can happen from time to time, because of hardware errors
or misconfigurations.  In the past we have had to correct it manually, which is time-consuming
and which can result in downtime.
> I would like to reuse as much code as possible from the NameNode in this tool.  Hopefully,
the effort that is spent developing this will also make the NameNode editLog and image processing
even more robust than it already is.
> Another approach that we have discussed is NOT having an offline tool, but just having
a switch supplied to the NameNode, like "--auto-fix" or "--force-fix".  In that case, the
NameNode would attempt to "guess" when data was missing or incomplete in the EditLog or Image,
rather than aborting as it does now.  Like the proposed fsck tool, this switch could be used
to get users back on their feet quickly after a problem developed.  I am not in favor of this
approach, because there is a danger that users could supply this flag in cases where it is
not appropriate.  This risk does not exist for an offline fsck tool, since it would have to
be run explicitly.  However, I wanted to mention this proposal here for completeness.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
