hadoop-common-user mailing list archives

From Peter Falk <pe...@bugsoft.nu>
Subject Please help! Corrupt fsimage?
Date Wed, 07 Jul 2010 12:46:14 GMT
Hi,

After a restart of our live cluster today, the name node fails to start with
the log message shown below. There is a big file called edits.new in the
"current" folder that seems to be the only one that has received changes
recently (no changes to edits or fsimage for over a month). Is that
normal?

The last change to the edits.new file was right before shutting down the
cluster. It seems like the shutdown was unable to store valid fsimage,
edits, and edits.new files. The secondary name node's image does not include
an edits.new file, only edits and fsimage, which are identical to the name
node's versions, so it is of no help.

Would appreciate any help in understanding what could have gone wrong. The
shutdown seemed to complete just fine, without any error message. Is there
any way to recreate the image from the data, or any other way to save our
production data?

Sincerely,
Peter

2010-07-07 14:30:26,949 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=9000
2010-07-07 14:30:26,960 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2010-07-07 14:30:27,019 DEBUG org.apache.hadoop.security.UserGroupInformation: Unix Login: hbase,hbase
2010-07-07 14:30:27,149 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.EOFException
        at java.io.DataInputStream.readShort(DataInputStream.java:298)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:881)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:807)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2010-07-07 14:30:27,150 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
2010-07-07 14:30:27,151 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
        at java.io.DataInputStream.readShort(DataInputStream.java:298)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:881)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:807)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
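
A note on reading the trace: java.io.DataInputStream.readShort throws
EOFException when the stream ends before two bytes could be read. The failure
inside FSImage.loadFSImage is therefore consistent with a file that is empty
or ends earlier than the loader expects. The log does not show which file was
being read. The sketch below only reproduces the JDK behaviour on in-memory
data; the class and method names are made up for illustration and are not part
of Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: not Hadoop code, just the JDK call at the top of the trace.
class ReadShortEofDemo {

    // Returns true if readShort() reaches end of stream before reading two bytes.
    static boolean readShortHitsEof(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readShort();
            return false;
        } catch (EOFException e) {
            // Same exception type as the first line of the stack trace.
            return true;
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readShortHitsEof(new byte[0]));           // prints "true"
        System.out.println(readShortHitsEof(new byte[] {0x00}));     // prints "true"
        System.out.println(readShortHitsEof(new byte[] {0x00, 0x2A})); // prints "false"
    }
}
```

An empty or one-byte input fails in the same way as the name node does here,
while any input with at least two bytes remaining is read without error.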
