hadoop-hdfs-issues mailing list archives

From "Carlos Valiente (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1002) Secondary Name Node crash, NPE in edit log replay
Date Sat, 13 Mar 2010 15:30:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844903#action_12844903 ]

Carlos Valiente commented on HDFS-1002:
---------------------------------------

Any news on this issue? I'm seeing it as well on my cluster:

{noformat}
2010-03-13 15:19:48,830 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-6710987566789746717_10247649
2010-03-13 15:19:50,443 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from XXX.XXX.XXX.XXX
2010-03-13 15:19:50,611 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Downloaded file fsimage size 16959426 bytes.
2010-03-13 15:19:50,614 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Downloaded file edits size 19474 bytes.
2010-03-13 15:19:50,626 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: defaultReplication = 3
2010-03-13 15:19:50,626 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: maxReplication = 512
2010-03-13 15:19:50,626 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: minReplication = 1
2010-03-13 15:19:50,626 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: maxReplicationStreams = 2
2010-03-13 15:19:50,626 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: shouldCheckForEnoughRacks = true
2010-03-13 15:19:50,662 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop,dialout,video
2010-03-13 15:19:50,662 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2010-03-13 15:19:50,662 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2010-03-13 15:19:50,665 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2010-03-13 15:19:50,694 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 105773
2010-03-13 15:19:53,120 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 5
2010-03-13 15:19:53,125 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Throwable Exception in doCheckpoint:
2010-03-13 15:19:53,125 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1152)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1164)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1067)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:213)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:511)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:401)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:368)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1172)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:594)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:476)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:353)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:317)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:219)
        at java.lang.Thread.run(Thread.java:619)
{noformat}

I'm running 0.21.0-SNAPSHOT, revision 916530. I'm happy to provide the fsimage, edits, and
edits.new files if it helps.

C

> Secondary Name Node crash, NPE in edit log replay
> -------------------------------------------------
>
>                 Key: HDFS-1002
>                 URL: https://issues.apache.org/jira/browse/HDFS-1002
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: ryan rawson
>             Fix For: 0.21.0
>
>         Attachments: snn_crash.tar.gz, snn_log.txt
>
>
> An NPE in SNN, the core of the message looks like yay so:
> 2010-02-25 11:54:05,834 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1152)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1164)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1067)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:213)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:511)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:401)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:368)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1172)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:594)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:476)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:353)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:317)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:219)
>         at java.lang.Thread.run(Thread.java:619)
> This happens even if I restart SNN over and over again.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

