hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1594) When the disk becomes full Namenode is getting shutdown and not able to recover
Date Mon, 14 Mar 2011 17:28:29 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006527#comment-13006527
] 

dhruba borthakur commented on HDFS-1594:
----------------------------------------

@Devaraj: ur proposal to make the NN go into safemode if the fsedits partition is almost full
sounds good.

@Todd: thanks for the info. If there is a possibility of fsedit corruption when the disk is
full, (irrespective of the patch suggested in this JIRA), then we need to fix it. This patch
tries to avoid this situation but is not foolproof. If the NN can (somehow) know that this
is the last partial transaction, then is can safely ignore it. maybe, when we pre-allocate
the editslog, we should fill it up with a specific pattern so that it is easy to detect if
the partial transaction is the last one? 

> When the disk becomes full Namenode is getting shutdown and not able to recover
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-1594
>                 URL: https://issues.apache.org/jira/browse/HDFS-1594
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.21.0, 0.21.1, 0.22.0
>         Environment: Linux linux124 2.6.27.19-5-default #1 SMP 2009-02-28 04:40:21 +0100
x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Devaraj K
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1594.patch, HDFS-1594.patch, HDFS-1594.patch, hadoop-root-namenode-linux124.log
>
>
> When the disk becomes full name node is shutting down and if we try to start after making
the space available It is not starting and throwing the below exception.
> {code:xml} 
> 2011-01-24 23:23:33,727 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
initialization failed.
> java.io.EOFException
> 	at java.io.DataInputStream.readFully(DataInputStream.java:180)
> 	at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,729 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
> 	at java.io.DataInputStream.readFully(DataInputStream.java:180)
> 	at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,730 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at linux124/10.18.52.124
> ************************************************************/
> {code} 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message