hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1690) Name Node should not process format command while it is running.
Date Wed, 29 Jun 2011 01:32:28 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056940#comment-13056940 ]

Konstantin Shvachko commented on HDFS-1690:

I think we can skip checking for the presence of in_use.lock during formatting and go directly
to setting the lock on that file. The effect will be the same: if the file does not exist,
locking will create it and then lock it.
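
A minimal sketch of the "lock without checking first" idea, using plain java.nio rather than
the actual Storage code. The class name {{LockSketch}} and the method {{tryLock(File)}} are
hypothetical names chosen for illustration; only the file name in_use.lock comes from the
discussion above.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

public class LockSketch {
    /**
     * Tries to lock in_use.lock inside the given directory without
     * checking first whether the file exists.
     * Returns the held lock, or null if someone else already holds it.
     * (Hypothetical helper, not the Hadoop API.)
     */
    static FileLock tryLock(File storageDir) throws IOException {
        File lockFile = new File(storageDir, "in_use.lock");
        // Opening in "rws" mode creates the file if it is absent,
        // so no separate existence check is needed.
        RandomAccessFile raf = new RandomAccessFile(lockFile, "rws");
        FileLock lock = null;
        try {
            // Returns null when another process holds the lock.
            lock = raf.getChannel().tryLock();
        } catch (OverlappingFileLockException e) {
            // The lock is already held inside this JVM; report it as "not acquired".
        } finally {
            if (lock == null) {
                raf.close();
            }
        }
        return lock;
    }
}
```

A caller such as format would treat a null result as "another NN is running" and stop before
touching any other file.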

I think the problem with formatting is in {{StorageDirectory.clearDirectory()}}. It calls
{{FileUtil.fullyDelete()}} if {{curDir}} exists, which deletes all files in the directory,
but the order of deletion is not deterministic. This probably causes the partial formatting
you mentioned before. If {{format()}} deletes in_use.lock first, it will fail correctly if
another NN is still running. If it deletes fsimage first and then in_use.lock, it will also
fail, but will leave the state unrecoverable.

So I'd propose to modify {{StorageDirectory.clearDirectory()}} to first delete the lock file
({{STORAGE_FILE_LOCK=in_use.lock}}) and only then the rest of the directory. I see now it's a bug.
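
The proposed ordering could look roughly like the sketch below. It is not the actual
{{StorageDirectory}} code: the class {{ClearDirectorySketch}} and the helper {{deleteRecursively}}
are hypothetical, and it assumes the lock file sits in the directory being cleared, as the
comment above implies. Whether deleting a file that another process has locked actually fails
depends on the platform, so this only illustrates the ordering.

```java
import java.io.File;
import java.io.IOException;

public class ClearDirectorySketch {
    static final String STORAGE_FILE_LOCK = "in_use.lock";

    /** Deletes the lock file first, then everything else in curDir. */
    static void clearDirectory(File curDir) throws IOException {
        if (!curDir.exists()) {
            if (!curDir.mkdirs()) {
                throw new IOException("Cannot create directory " + curDir);
            }
            return;
        }
        // Step 1: remove the lock file before anything else, so a failure
        // here leaves fsimage and the other files untouched.
        File lockFile = new File(curDir, STORAGE_FILE_LOCK);
        if (lockFile.exists() && !lockFile.delete()) {
            throw new IOException("Cannot remove " + lockFile
                + "; the directory may be in use");
        }
        // Step 2: remove the remaining contents in whatever order
        // the file system lists them.
        File[] rest = curDir.listFiles();
        if (rest != null) {
            for (File f : rest) {
                deleteRecursively(f);
            }
        }
    }

    /** Hypothetical helper: deletes a file or a whole directory tree. */
    static void deleteRecursively(File f) throws IOException {
        File[] children = f.listFiles(); // null for regular files
        if (children != null) {
            for (File c : children) {
                deleteRecursively(c);
            }
        }
        if (!f.delete()) {
            throw new IOException("Cannot remove " + f);
        }
    }
}
```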

> Name Node should not process format command while it is running.
> ----------------------------------------------------------------
>                 Key: HDFS-1690
>                 URL: https://issues.apache.org/jira/browse/HDFS-1690
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20.1, 0.20.2, 0.21.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>         Attachments: HDFS-1690.patch
> Currently the NameNode allows the format command while it is running. In this case the command is
executed partially (the lock file is deleted) and an exception is thrown. Because of this, the Name Node
has to be formatted after restart. Such cases can happen accidentally. To prevent
them, the Name Node should not execute the format command partially while it is running.
It can straight away throw an exception or log a message saying that the Name Node is running.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira