hadoop-hdfs-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1002) Secondary Name Node crash, NPE in edit log replay
Date Thu, 01 Apr 2010 20:32:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852535#action_12852535 ]

Hairong Kuang commented on HDFS-1002:
-------------------------------------

I took a look at the image & edits that Carlos provided at HDFS-686. They clearly indicate
that some edit entries are missing. The missing parent directory /fields/0001/20100325_1200/c1b1301_wrep_o_12_pp_fc_tp
is not in the image, and no other entry in edits contains this directory.

In this case, although addChildNPE.patch avoids the crash, it does not help get the missing
directory back.
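
To illustrate the point, here is a minimal sketch of a replay step that skips an add when the
parent directory is absent. It is a toy model, not the real FSDirectory code, and it assumes
the patch amounts to a null-parent guard, which the comment above does not confirm. The class
ReplaySketch, its methods, and the file name part-0 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy namespace used to replay "add file" edits. All names are hypothetical. */
public class ReplaySketch {
    private final Set<String> dirs = new HashSet<>();      // directories known from the image
    private final Set<String> files = new HashSet<>();     // files added during replay
    private final List<String> skipped = new ArrayList<>(); // edits that could not be applied

    public ReplaySketch() {
        dirs.add("/"); // the root always exists
    }

    public void mkdir(String path) {
        dirs.add(path);
    }

    /** Returns the parent path, e.g. "/a/b/c" -> "/a/b" and "/a" -> "/". */
    static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    /**
     * Applies one "add file" edit. If the parent directory is unknown, the edit
     * is recorded as skipped instead of failing. The parent is NOT recreated.
     */
    public boolean addFile(String path) {
        if (!dirs.contains(parentOf(path))) {
            skipped.add(path);
            return false;
        }
        files.add(path);
        return true;
    }

    public boolean hasDir(String path) {
        return dirs.contains(path);
    }

    public List<String> skipped() {
        return skipped;
    }

    public static void main(String[] args) {
        ReplaySketch ns = new ReplaySketch();
        ns.mkdir("/fields"); // "/fields/0001" is deliberately left out
        boolean added = ns.addFile("/fields/0001/part-0");
        System.out.println("added=" + added + " skipped=" + ns.skipped());
        System.out.println("parent present=" + ns.hasDir("/fields/0001"));
    }
}
```

In this sketch the replay completes without an exception, but the file is dropped and the
parent directory is still absent afterwards, which is the limitation described above.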

> Secondary Name Node crash, NPE in edit log replay
> -------------------------------------------------
>
>                 Key: HDFS-1002
>                 URL: https://issues.apache.org/jira/browse/HDFS-1002
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: ryan rawson
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: addChildNPE.patch, snn_crash.tar.gz, snn_log.txt
>
>
> An NPE in the SNN; the core of the message looks like so:
> 2010-02-25 11:54:05,834 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1152)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1164)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1067)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:213)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:511)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:401)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:368)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1172)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:594)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:476)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:353)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:317)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:219)
>         at java.lang.Thread.run(Thread.java:619)
> This happens even if I restart SNN over and over again.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

