hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1594) When the disk becomes full Namenode is getting shutdown and not able to recover
Date Mon, 14 Mar 2011 17:02:29 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006515#comment-13006515
] 

Todd Lipcon commented on HDFS-1594:
-----------------------------------

Dhruba: I haven't dug into it yet, but I've also seen this before when the drive hosting the
log4j logs fills up. I think there are some cases in which we do a System.exit(1) which can
corrupt the logs -- the issue being that, even though we fsync after every edit, there's no
guarantee that we don't get half of the preceding edit. Hairong's proposal to checksum each
edit should help here.

> When the disk becomes full Namenode is getting shutdown and not able to recover
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-1594
>                 URL: https://issues.apache.org/jira/browse/HDFS-1594
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.21.0, 0.21.1, 0.22.0
>         Environment: Linux linux124 2.6.27.19-5-default #1 SMP 2009-02-28 04:40:21 +0100
x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Devaraj K
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1594.patch, HDFS-1594.patch, HDFS-1594.patch, hadoop-root-namenode-linux124.log
>
>
> When the disk becomes full name node is shutting down and if we try to start after making
the space available It is not starting and throwing the below exception.
> {code:xml} 
> 2011-01-24 23:23:33,727 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
initialization failed.
> java.io.EOFException
> 	at java.io.DataInputStream.readFully(DataInputStream.java:180)
> 	at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,729 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
> 	at java.io.DataInputStream.readFully(DataInputStream.java:180)
> 	at org.apache.hadoop.io.UTF8.readFields(UTF8.java:117)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readString(FSImageSerialization.java:201)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:185)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:60)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1089)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:1041)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:487)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:149)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:306)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:284)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:328)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:356)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:577)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1529)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1538)
> 2011-01-24 23:23:33,730 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at linux124/10.18.52.124
> ************************************************************/
> {code} 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message