hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12420) Disable Namenode format when data already exists
Date Thu, 14 Sep 2017 00:19:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165515#comment-16165515 ]

Allen Wittenauer commented on HDFS-12420:

bq. current format functionality is broken itself. It deletes the metadata while doing nothing
about the data stored in data-nodes. 

Just like mkfs.  And just like it, the fact that it doesn't delete the actual data is a feature,
not a bug.  If I restore the fsimage back then my data should come back too.  (mostly... new
data ofc is likely to be missing, etc) It's why making a copy of the fsimage is Hadoop Ops

Some key advice I give to admins:  you can try to prevent mistakes, but they'll still happen
despite your best efforts.  Once the low-hanging warnings are in place, the energy is better
spent on how to recover quickly. But that's a problem that's outside of the core code.
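
As an illustration of the recovery side, here is a minimal sketch of taking a timestamped copy of the fsimage. It relies on `hdfs dfsadmin -fetchImage <local directory>`, which downloads the most recent fsimage from the NameNode. The function name `backup_fsimage`, the `fsimage-<timestamp>` directory layout, and the `HDFS_CMD` override are assumptions made for this example, not part of Hadoop.

```shell
# Sketch only: copy the latest fsimage into a new timestamped directory.
# backup_fsimage, the directory naming, and HDFS_CMD are hypothetical.
backup_fsimage() {
    backup_root="$1"
    dest="${backup_root}/fsimage-$(date +%Y%m%d-%H%M%S)"

    # Create the destination first; -fetchImage expects a local directory.
    mkdir -p "$dest" || return 1

    # HDFS_CMD lets the hdfs binary be swapped out (for example in a dry run).
    "${HDFS_CMD:-hdfs}" dfsadmin -fetchImage "$dest" || return 1

    # Print where the copy was written so callers can log or prune it.
    echo "$dest"
}
```

Something like `backup_fsimage /backups/namenode` could then be run from cron. The backup root here is a placeholder, and retention and off-host copies are left out of the sketch.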

For the record, yes, I've made HUGE mistakes like this in my career.  Every admin has. In
my case, I brought down an entire hospital once.  Even with that experience, I still think
requiring metadata deletion outside of the tool set is waaaaay overkill.

bq. may be being able to tag a cluster as "production" like discussed above is a better idea?

Yeah, sure, whatever.  All that's going to happen is:

hdfs --config /tmp/mymodifiedconfig namenode -format -force

If a user is too lazy/impatient/distracted to check that they are on a live system before
hitting y, they'll just change the flag and then format.  But if that makes folks happy, fine.
 It still sounds like the console output needs some work though if a user couldn't "see" it.
 (Not sure I agree with that either, but whatever.)
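
For reference, a "low hanging" check of this kind can also live outside the core code. The sketch below refuses to format when the given metadata directory already contains `current/VERSION`, the file a formatted NameNode storage directory holds. The names `name_dir_has_metadata` and `guarded_format` and the `HDFS_CMD` override are hypothetical, and the caller is assumed to pass the metadata directory explicitly.

```shell
# Sketch only: skip the format when NameNode metadata already exists.
# name_dir_has_metadata, guarded_format, and HDFS_CMD are hypothetical.
name_dir_has_metadata() {
    # A formatted NameNode storage directory contains current/VERSION.
    [ -f "$1/current/VERSION" ]
}

guarded_format() {
    name_dir="$1"
    if name_dir_has_metadata "$name_dir"; then
        echo "refusing to format: $name_dir/current/VERSION exists" >&2
        return 1
    fi
    # No existing metadata found, so hand off to the real command.
    "${HDFS_CMD:-hdfs}" namenode -format
}
```

A wrapper like this is bypassed just as easily as a flag: anyone can call `hdfs namenode -format` directly.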

BTW, a quick search for how the equivalent problem is solved in databases is interesting.
Almost all of them that I looked at: don't give the user access. So yes, enough rope to hang
themselves seems to be the expectation operationally.

> Disable Namenode format when data already exists
> ------------------------------------------------
>                 Key: HDFS-12420
>                 URL: https://issues.apache.org/jira/browse/HDFS-12420
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ajay Kumar
>            Assignee: Ajay Kumar
>         Attachments: HDFS-12420.01.patch, HDFS-12420.02.patch
> Disable NameNode format to avoid accidental formatting of Namenode in production cluster.
> If someone really wants to delete the complete fsImage, they can first delete the metadata
> dir and then run {code} hdfs namenode -format{code} manually.

This message was sent by Atlassian JIRA
