Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Sun, 2 Sep 2012 17:18:08 +1100 (NCT)
From: "Aaron T. Myers (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <794821162.28221.1346566688533.JavaMail.jiratomcat@arcas>
In-Reply-To: <911119017.26541.1346490247653.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HDFS-3886) Shutdown requests can possibly check
 for checkpoint issues (corrupted edits) and save a good namespace copy
 before closing down?
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446883#comment-13446883 ] 

Aaron T. Myers commented on HDFS-3886:
--------------------------------------

Interesting idea. Perhaps we could add a "clean shutdown" dfsadmin command, and then add an extra action to the init.d script which a cautious admin can choose to run? That way we preserve the shutdown behavior that Steve is concerned about, but give the admin the option of getting guaranteed-good metadata. Just thinking out loud.
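
To make the idea concrete, here is a rough sketch of what such an opt-in init.d action might do. It is only an illustration under assumptions, not an existing HDFS feature: no "clean shutdown" dfsadmin command exists, so the sketch chains the existing "hdfs dfsadmin -safemode enter" and "hdfs dfsadmin -saveNamespace" commands instead. The names clean_shutdown, run_command and CLEAN_SHUTDOWN_STEPS are hypothetical, and the stop command is left as a parameter because it depends on the installation. Refusing to stop the NameNode when the save fails is a design choice of this sketch, so that the admin gets a chance to investigate first.

```python
import subprocess
from typing import Callable, List, Sequence

# Existing dfsadmin subcommands. Chaining them into a "clean shutdown"
# is an assumption of this sketch, not something HDFS provides.
CLEAN_SHUTDOWN_STEPS: List[List[str]] = [
    ["hdfs", "dfsadmin", "-safemode", "enter"],  # stop namespace modifications
    ["hdfs", "dfsadmin", "-saveNamespace"],      # write a fresh image to the name dirs
]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.call(list(argv))


def clean_shutdown(stop_argv: Sequence[str],
                   run: Callable[[Sequence[str]], int] = run_command) -> bool:
    """Save the namespace, then stop the NameNode (hypothetical helper).

    Returns False, without running the stop command, if any
    preparatory step exits with a non-zero status.
    """
    for step in CLEAN_SHUTDOWN_STEPS:
        if run(step) != 0:
            return False
    return run(stop_argv) == 0
```

A cautious admin's init.d action could then call something like clean_shutdown(["hadoop-daemon.sh", "stop", "namenode"]), while the regular stop action stays unchanged.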
                
> Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3886
>                 URL: https://issues.apache.org/jira/browse/HDFS-3886
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Priority: Minor
>
> HDFS-3878 sort of gave me this idea. Aside from having a method to download it to a different location, we could also lock the namesystem (or deactivate the client RPC server) and save the namesystem before we complete the shutdown.
> The init.d/shutdown scripts would have to cooperate with this somehow, so that they do not kill -9 the process while the save is in progress. Also, the new image could be stored in a shutdown.chkpt directory, so that it does not interfere with the regular directories but still allows easier recovery.
> Obviously this will still not work if all directories are broken. So maybe we could have some configs to tackle that as well?
> I haven't thought this through, so let me know which parts are a bad idea :)
