hadoop-common-user mailing list archives

From Peter Falk <pe...@bugsoft.nu>
Subject Re: Please help! Corrupt fsimage?
Date Wed, 07 Jul 2010 14:34:39 GMT
Just a little update. We found a working fsimage that was just a couple of
days older than the corrupt one. We replaced the corrupt fsimage with the
working one and kept the edits and edits.new files, hoping that the latest
edits would still be used. However, when starting the namenode, the
following error message appears. Any thoughts, ideas, or hints on how to
continue? Should we edit the edits files somehow?
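
Since swapping fsimage and edits files by hand is easy to get wrong, here is
a minimal sketch of how the name directory could be copied and verified
before any further experiment. It is only an illustration under stated
assumptions: the function names (sha256_of, backup_name_dir, describe) are
hypothetical helpers, not part of Hadoop, and the directory paths are
placeholders for whatever dfs.name.dir points to on the cluster.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_name_dir(name_dir: Path, backup_root: Path) -> Path:
    """Copy name_dir into a timestamped folder and verify every copied file.

    Both arguments are assumptions: name_dir stands for the namenode's
    storage directory, backup_root for any location with enough free space.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{name_dir.name}-{stamp}"
    shutil.copytree(name_dir, target)
    for src in name_dir.rglob("*"):
        if src.is_file():
            dst = target / src.relative_to(name_dir)
            if sha256_of(src) != sha256_of(dst):
                raise IOError(f"backup mismatch for {src}")
    return target


def describe(name_dir: Path) -> list:
    """List (relative path, size in bytes, mtime) per file, sorted by path."""
    rows = []
    for path in sorted(p for p in name_dir.rglob("*") if p.is_file()):
        stat = path.stat()
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(stat.st_mtime))
        rows.append((str(path.relative_to(name_dir)), stat.st_size, stamp))
    return rows
```

With the namenode stopped, backup_name_dir(Path("/path/to/dfs/name"),
Path("/path/to/backups")) would leave an untouched copy to return to, and
describe() shows at a glance which of fsimage, edits, and edits.new is
empty or stale. The same sha256_of helper can compare the secondary name
node's files against the name node's.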

TIA,
Peter

2010-07-07 16:21:10,312 INFO org.apache.hadoop.hdfs.server.common.Storage:
Number of files = 28372
2010-07-07 16:21:11,162 INFO org.apache.hadoop.hdfs.server.common.Storage:
Number of files under construction = 8
2010-07-07 16:21:11,164 INFO org.apache.hadoop.hdfs.server.common.Storage:
Image file of size 3315887 loaded in 0 seconds.
2010-07-07 16:21:11,164 DEBUG
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 9:
/hbase/.logs/miller,60020,1274447474064/hlog.dat.1274706452423 numblocks : 1
clientHolder  clientMachine
2010-07-07 16:21:11,164 DEBUG org.apache.hadoop.hdfs.StateChange: DIR*
FSDirectory.unprotectedDelete: failed to remove
/hbase/.logs/miller,60020,1274447474064/hlog.dat.1274706452423 because it
does not exist
2010-07-07 16:21:11,164 ERROR
org.apache.hadoop.hdfs.server.namenode.NameNode:
java.lang.NullPointerException
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1006)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:982)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:194)
        at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:615)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:992)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)

2010-07-07 16:21:11,165 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fanta/192.168.10.53
************************************************************/


On Wed, Jul 7, 2010 at 14:46, Peter Falk <peter@bugsoft.nu> wrote:

> Hi,
>
> After a restart of our live cluster today, the name node fails to start
> with the log message seen below. There is a big file called edits.new in the
> "current" folder that seems to be the only one that has received changes
> recently (no changes to the edits or the fsimage for over a month). Is that
> normal?
>
> The last change to the edits.new file was right before shutting down the
> cluster. It seems like the shutdown was unable to store valid fsimage,
> edits, and edits.new files. The secondary name node image does not include
> the edits.new file, only edits and fsimage, which are identical to the name
> node's versions, so they are no help.
>
> Would appreciate any help in understanding what could have gone wrong. The
> shutdown seemed to complete just fine, without any error message. Is there
> any way to recreate the image from the data, or any other way to save our
> production data?
>
> Sincerely,
> Peter
>
> 2010-07-07 14:30:26,949 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> Initializing RPC Metrics with hostName=NameNode, port=9000
> 2010-07-07 14:30:26,960 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=NameNode, sessionId=null
> 2010-07-07 14:30:27,019 DEBUG
> org.apache.hadoop.security.UserGroupInformation: Unix Login: hbase,hbase
> 2010-07-07 14:30:27,149 ERROR
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
> initialization failed.
> java.io.EOFException
>         at java.io.DataInputStream.readShort(DataInputStream.java:298)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:881)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:807)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
> 2010-07-07 14:30:27,150 INFO org.apache.hadoop.ipc.Server: Stopping server
> on 9000
> 2010-07-07 14:30:27,151 ERROR
> org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
>         at java.io.DataInputStream.readShort(DataInputStream.java:298)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:881)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:807)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
>
