hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7991) Allow users to skip checkpoint when stopping NameNode
Date Tue, 31 Mar 2015 18:50:54 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389178#comment-14389178 ]

Jing Zhao commented on HDFS-7991:

bq. This is easily fixed by just increasing the timeout or adding other logic such as
asking if the NN is still alive, etc.

But it is hard to know whether the NN is still checkpointing or is stuck somewhere else.
It is also hard to derive a deterministic bound for the timeout value.
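
To make the concern concrete, here is a minimal sketch of the kind of bounded wait the quoted suggestion implies. The function name {{wait_for_exit}} and its arguments are hypothetical and not taken from any patch on this issue; the liveness probe is a plain {{kill -0}}.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the HDFS-7991 patches): wait up to
# <timeout> seconds for process <pid> to exit.
wait_for_exit() {
  local pid="$1" timeout="$2" waited=0
  # kill -0 sends no signal; it only tests whether the pid still exists.
  while kill -0 "${pid}" 2>/dev/null; do
    if [ "${waited}" -ge "${timeout}" ]; then
      # Still alive after the deadline. The probe cannot tell whether the
      # process is busy checkpointing or stuck somewhere else.
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

The liveness check only answers "is the process there", so raising the timeout trades a longer wait for the same ambiguity.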

bq. The problem is that HADOOP_OPTS has the NN's configuration inside it. So, for example,
if a user sets the heap size to 64g

Good catch. I will try to fix this in a later patch.

bq. The code absolutely must shell out another bin/hdfs process to get the proper HADOOP_OPTS
setting. I suspect it will actually have to use a subshell plus parameter captures so that
the environment is clean due to various export statements throughout the code and in a lot
of user's *-env.sh files.

One question here is: can we simply capture the value of {{HADOOP_OPTS}} before appending
{{HADOOP_NAMENODE_OPTS}} to it, and use the captured value for this checkpoint? That looks
equivalent to running a dfsadmin command on the NN's machine.
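
A rough sketch of that idea, under stated assumptions: the variable name {{HADOOP_CLIENT_SIDE_OPTS}} and the example option values are made up for illustration, and the real scripts may assemble {{HADOOP_OPTS}} differently.

```shell
#!/usr/bin/env bash
# Illustrative values only (assumed, not taken from any real configuration).
HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"   # generic options
HADOOP_NAMENODE_OPTS="-Xmx64g"                  # NameNode-only options

# Snapshot the generic options BEFORE the NameNode-specific ones are appended.
# HADOOP_CLIENT_SIDE_OPTS is a hypothetical name used only in this sketch.
HADOOP_CLIENT_SIDE_OPTS="${HADOOP_OPTS}"

# Build the daemon's options as usual.
HADOOP_OPTS="${HADOOP_OPTS} ${HADOOP_NAMENODE_OPTS}"

# The checkpoint step would then run with the snapshot, so the client JVM
# does not inherit the daemon's 64g heap, for example:
#   HADOOP_OPTS="${HADOOP_CLIENT_SIDE_OPTS}" hdfs dfsadmin -saveNamespace
echo "daemon opts: ${HADOOP_OPTS}"
echo "client opts: ${HADOOP_CLIENT_SIDE_OPTS}"
```

This does not address the concern in the quoted comment: if {{export}} statements in a user's *-env.sh files have already put NameNode settings into {{HADOOP_OPTS}} before the snapshot is taken, the captured value is not clean either.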

> Allow users to skip checkpoint when stopping NameNode
> -----------------------------------------------------
>                 Key: HDFS-7991
>                 URL: https://issues.apache.org/jira/browse/HDFS-7991
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-7991.000.patch, HDFS-7991.001.patch, HDFS-7991.002.patch, HDFS-7991.003.patch
> This is a follow-up jira of HDFS-6353. HDFS-6353 adds the functionality to check if saving
namespace is necessary before stopping namenode. As [~kihwal] pointed out in this [comment|https://issues.apache.org/jira/browse/HDFS-6353?focusedCommentId=14380898&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14380898],
in a secured cluster this new functionality requires the user to be kinit'ed.

This message was sent by Atlassian JIRA
