hadoop-hdfs-issues mailing list archives

From "Ivan Kelly (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1725) Cleanup FSImage construction
Date Sat, 12 Mar 2011 14:14:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006045#comment-13006045 ]

Ivan Kelly commented on HDFS-1725:

1. setStorageDirectories also calls removedStorageDirs.clear() and re-initializes storage
directories. This patch removes the calls to setStorageDirectories from a few places, therefore
those StorageDirectories that were removed due to some error might continue to hang in removedStorageDirs
and won't be reinstated. Will attemptRestoreRemovedStorage take care of clearing up of removedStorageDirs
in all cases?
attemptRestoreRemovedStorage is called before attempting to save the namespace, so it will
be called in all cases on the primary node. setStorageDirectories was only ever called in
initialisation for the primary node anyhow, so it wouldn't have restored anything.

The Secondary and Backup nodes are a different story. They have recoverCreate etc., which run
periodically and used to call setStorageDirectories. The effect of this was twofold: a) to unlock
the directories for analysis, and b) to restore failed storage directories. It was never to
actually change the storage directories, as it never actually does this. Now it explicitly unlocks
and attempts the restore (added in latest patch).
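
To illustrate the restore behaviour described above, here is a minimal sketch. It is not the NNStorage code: the class StorageSketch, its methods reportError and restoreRemovedDirs, and the isUsable check are all hypothetical names, and directories are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of a storage object that tracks active and removed
// directories. None of these names are taken from the actual patch.
class StorageSketch {
    final List<String> activeDirs = new ArrayList<>();
    final List<String> removedDirs = new ArrayList<>();
    private final Predicate<String> isUsable; // assumed health check

    StorageSketch(List<String> dirs, Predicate<String> isUsable) {
        this.activeDirs.addAll(dirs);
        this.isUsable = isUsable;
    }

    // On an error, take the directory out of use and remember it.
    void reportError(String dir) {
        if (activeDirs.remove(dir)) {
            removedDirs.add(dir);
        }
    }

    // Run before each save or checkpoint: move every removed directory
    // that passes the health check back into the active list.
    void restoreRemovedDirs() {
        Iterator<String> it = removedDirs.iterator();
        while (it.hasNext()) {
            String dir = it.next();
            if (isUsable.test(dir)) {
                it.remove();
                activeDirs.add(dir);
            }
        }
    }
}
```

Because the restore step runs on every save or checkpoint, a directory that stays broken simply remains in the removed list until a later attempt succeeds.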
>+ for (URI uri : editDirsToFormat) {
>+   if (!dirsToFormat.contains(uri)) {
>+     dirsToFormat.add(uri);
>+   }
>+ }
This means currently we don't ask for confirmation before formatting edit directories that
are not namespace directories. I am wondering do we need to change that, although it seems
to be ok.
I think it must have been an oversight at some stage, when separate directories for images
and edits were introduced. EditLogs are just as important as images, so it's best to confirm
if we're going to delete them.
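
As a standalone illustration of the merge in the quoted hunk, the sketch below collects the image directories plus any edits directories not already listed, so a single confirmation pass can cover both. The class FormatDirs and the method dirsToConfirm are hypothetical names for this example and do not appear in the patch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of the patch.
class FormatDirs {
    // Returns the image directories followed by every edits directory
    // that is not already in the list, preserving order.
    static List<URI> dirsToConfirm(Collection<URI> imageDirs,
                                   Collection<URI> editDirs) {
        List<URI> dirsToFormat = new ArrayList<>(imageDirs);
        for (URI uri : editDirs) {
            if (!dirsToFormat.contains(uri)) {
                dirsToFormat.add(uri);
            }
        }
        return dirsToFormat;
    }
}
```

A directory shared between images and edits appears only once in the result, so the user would be asked about it only once.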

4. In SecondaryNameNode.java#startCheckpoint.
What is the reason behind removing the call to unlockAll?

In recoverCreate a call to unlockAll is added and storage.close is removed. storage.close
was also calling listeners.clear, which will not be called now. Is that ok?
See response to 1. Regarding listeners, I don't think it will make any difference. The listener
is only there to allow NNStorage to inform the objects using it that an error has occurred. If
a directory does cause an error, removing it from use is the correct thing to do.

> Cleanup FSImage construction
> ----------------------------
>                 Key: HDFS-1725
>                 URL: https://issues.apache.org/jira/browse/HDFS-1725
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>         Attachments: HDFS-1725.diff, HDFS-1725.diff, HDFS-1725.diff
> FSImage construction is messy. Sometimes the storage directories in use are set straight
> away, sometimes they are not. This makes it hard for anything under FSImage (i.e. FSEditLog)
> to make assumptions about what it can use. Therefore, this patch makes FSImage set the storage
> directories in use during construction, and never allows them to change. If you want to change
> storage directories you create a new image.
> Also, all the construction code should be the same with the only difference being the
> parameters passed. When not passed, these should get sensible defaults.
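
The construction rule in the description above (directories fixed at construction, defaults when none are passed) could look roughly like the following. This is only a sketch under assumptions: ImageSketch, defaultDirs and the default path are hypothetical and are not the FSImage API.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an image object; not the real FSImage.
final class ImageSketch {
    private final List<URI> storageDirs;

    // The directories are copied and frozen here and never change afterwards.
    ImageSketch(List<URI> storageDirs) {
        this.storageDirs =
            Collections.unmodifiableList(new ArrayList<>(storageDirs));
    }

    // Same construction path, with a default when no directories are passed.
    ImageSketch() {
        this(defaultDirs());
    }

    // Assumed default location, chosen only for this example.
    static List<URI> defaultDirs() {
        return Collections.singletonList(URI.create("file:///tmp/dfs/name"));
    }

    List<URI> getStorageDirs() {
        return storageDirs;
    }
}
```

To use a different set of directories, a caller creates a new ImageSketch rather than mutating an existing one, so anything holding a reference can rely on the list staying the same.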

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
