Date: Tue, 19 Apr 2011 00:00:09 +0000 (UTC)
From: "Aaron T. Myers (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1923801705.65979.1303171209056.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <28557936.153581295876025958.JavaMail.jira@thor>
Subject: [jira] [Commented] (HDFS-1594) When the disk becomes full Namenode is getting shutdown and not able to recover

[ https://issues.apache.org/jira/browse/HDFS-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021347#comment-13021347 ]

Aaron T. Myers commented on HDFS-1594:
--------------------------------------

bq. I agree, should be pulled out to a separate jira.

Removed.

bq. Good idea, there's nothing edits-specific here. Would need to add a test that if the admin does pass in the volume that hosts the edits log it doesn't conflict with the default behavior (e.g. double monitoring).

Done. I used a {{HashMap}} indexed by volume, and added tests to make sure we check any single volume at most once.

bq. What's the intended behavior if there are n disks and one fills up before the others? Seems like this volume should be taken off-line and the NN does not enter SM.

I disagree. I think losing one of the (supposedly redundant) volumes is sufficient cause for alarm to warrant putting the whole NN into SM.

bq. If there's just a global threshold, would this cause the overall threshold to drop (because the removed volume's free space no longer counts towards the total), causing a cascade where the other volumes go off-line? This would suggest a threshold per volume. Though if we can make a single, simple threshold work, that seems better from a usability perspective.

I should have been clearer. The current implementation does have a threshold per volume; it's just the same threshold for every volume. I was not trying to distinguish between a single total threshold vs.
per-volume thresholds. Rather, the question I was trying to ask is: "should the user be able to specify distinct thresholds per volume? e.g. 100MB on /disk/1 and 1GB on /disk/2". I'm in favor of the current implementation: a single configurable threshold, which applies to each volume separately.

bq. In both cases I think the admin would want to have to manually tell the NN to leave SM while they are working (i.e. it should not leave SM without them explicitly telling it to do so). If they want automatic behavior they can continuously monitor/roll on these volumes so they don't get into this scenario, and they don't want the monitoring/rolling to race with the free-space detection (e.g. they'd want to have to take action if this process ever crosses the threshold they set). I.e. it seems like once you've gone into SM due to lack of free space, you should stay there until the admin has had a chance to rectify the situation.

Agreed. Done.

bq. Another test to add: the interaction between this detection and the CN checkpointing.

The current patch does not contain such a test. I'm thinking about the best way to implement it.

> When the disk becomes full Namenode is getting shutdown and not able to recover
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-1594
>                 URL: https://issues.apache.org/jira/browse/HDFS-1594
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.21.0, 0.21.1, 0.22.0
>         Environment: Linux linux124 2.6.27.19-5-default #1 SMP 2009-02-28 04:40:21 +0100 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Devaraj K
>            Assignee: Aaron T. Myers
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1594.patch, HDFS-1594.patch, HDFS-1594.patch, hadoop-root-namenode-linux124.log, hdfs-1594.0.patch, hdfs-1594.1.patch, hdfs-1594.2.patch, hdfs-1594.3.patch
>
>
> When the disk becomes full, the name node shuts down, and if we try to start it after making space available, it does not start and throws the below exception.
> {code:xml}
> 2011-01-24 23:23:33,727 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
> java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>     at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,729 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>     at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,730 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at linux124/10.18.52.124
> ************************************************************/
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
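For illustration, the behavior agreed on in the comments above — one configurable threshold applied to each volume separately, volumes de-duplicated via a map so none is monitored twice even when the same directory hosts the edits log, and a safe-mode state that latches until the admin clears it — could be sketched roughly like this. All class and method names here are hypothetical, not taken from the actual HDFS-1594 patch:

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the per-volume free-space check discussed above.
// A single threshold (in bytes) applies to every monitored volume.
public class VolumeSpaceChecker {

    private final long minSpaceBytes;
    // Keyed by absolute path: adding the same volume twice is a no-op,
    // so no volume is double-monitored.
    private final Map<String, File> volumes = new HashMap<String, File>();
    // Latched once we detect low space; only the admin clears it.
    private boolean inSafeMode = false;

    public VolumeSpaceChecker(long minSpaceBytes) {
        this.minSpaceBytes = minSpaceBytes;
    }

    public void addVolume(File dir) {
        volumes.put(dir.getAbsolutePath(), dir);
    }

    public int volumeCount() {
        return volumes.size();
    }

    // True only if every monitored volume has at least minSpaceBytes free.
    // Losing any one (supposedly redundant) volume is treated as cause
    // to enter safe mode.
    public boolean hasAvailableSpace() {
        for (File dir : volumes.values()) {
            if (dir.getUsableSpace() < minSpaceBytes) {
                return false;
            }
        }
        return true;
    }

    // Once low space is observed, the flag stays set: the NN should not
    // leave SM automatically, even if space is later freed.
    public boolean shouldBeInSafeMode() {
        if (!hasAvailableSpace()) {
            inSafeMode = true;
        }
        return inSafeMode;
    }

    // The admin explicitly rectifies the situation and clears the latch.
    public void adminLeaveSafeMode() {
        inSafeMode = false;
    }
}
```

The map keyed by path mirrors the {{HashMap}}-indexed-by-volume approach mentioned above, and the latch in {{shouldBeInSafeMode}} mirrors the "stay in SM until the admin rectifies" behavior; the real patch's structure may differ.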