hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2838) NPE in FSNamesystem when in safe mode
Date Fri, 27 Jan 2012 02:03:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194359#comment-13194359 ]

Uma Maheswara Rao G commented on HDFS-2838:
-------------------------------------------

Thanks, Greg.
Eli, does this test fail reliably for you without the fix? For me, it passes even without the fix.
It may be OK to keep the test anyway: it can at least reproduce the failure intermittently, which may be better than nothing :-)
@Greg, a small suggestion: from next time, you can use HdfsConstants instead of FSConstants.

                
> NPE in FSNamesystem when in safe mode
> -------------------------------------
>
>                 Key: HDFS-2838
>                 URL: https://issues.apache.org/jira/browse/HDFS-2838
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>         Attachments: HDFS-2838-v2.patch, HDFS-2838.patch
>
>
> I'm seeing an NPE when running HBase 0.92 unit tests against the HA branch.  The test failure is: org.apache.hadoop.hbase.regionserver.wal.TestHLog.testAppendClose.
> Here is the backtrace:
> java.lang.NullPointerException
> 	at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:179)
> 	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getActiveBlockCount(BlockManager.java:2465)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.doConsistencyCheck(FSNamesystem.java:3591)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.isOn(FSNamesystem.java:3285)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$900(FSNamesystem.java:3196)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isInSafeMode(FSNamesystem.java:3670)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.isInSafeMode(NameNode.java:609)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:1476)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:1487)
> Here is the relevant section of the test:
> {code}
>    try {
>       DistributedFileSystem dfs = (DistributedFileSystem) cluster.getFileSystem();
>       dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_ENTER);
>       cluster.shutdown();
>       try {
>         // wal.writer.close() will throw an exception,
>         // but still call this since it closes the LogSyncer thread first
>         wal.close();
>       } catch (IOException e) {
>         LOG.info(e);
>       }
>       fs.close(); // closing FS last so DFSOutputStream can't call close
>       LOG.info("STOPPED first instance of the cluster");
>     } finally {
>       // Restart the cluster
>       while (cluster.isClusterUp()){
>         LOG.error("Waiting for cluster to go down");
>         Thread.sleep(1000);
>       }
> {code}
> Fix looks trivial, will include patch shortly.
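
For readers following the backtrace, here is a minimal, self-contained sketch of the general failure pattern it suggests: a size query that reaches a container after shutdown has released its backing store. This is not the attached patch and not HDFS code. The class BlocksMapSketch and its methods are hypothetical stand-ins, and the idea that the map is nulled on close is an assumption made only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a block map; NOT the real HDFS BlocksMap.
class BlocksMapSketch {
    // Assumption for illustration: the backing store is dropped on close().
    private Map<Long, String> blocks = new HashMap<>();

    void add(long blockId, String info) {
        blocks.put(blockId, info);
    }

    // Simulates shutdown releasing the backing store.
    void close() {
        blocks = null;
    }

    // Unguarded query: throws NullPointerException once close() has run.
    int unsafeSize() {
        return blocks.size();
    }

    // Guarded query: reports an empty map instead of throwing after close().
    int safeSize() {
        Map<Long, String> current = blocks;
        return current != null ? current.size() : 0;
    }

    public static void main(String[] args) {
        BlocksMapSketch map = new BlocksMapSketch();
        map.add(1L, "blk_1");
        map.close();
        System.out.println("safeSize after close: " + map.safeSize());
        try {
            map.unsafeSize();
        } catch (NullPointerException e) {
            System.out.println("unsafeSize after close threw NPE");
        }
    }
}
```

Whether a null guard like safeSize() or skipping the consistency check after shutdown is the better remedy depends on the actual code. The sketch only shows why a status poll such as the isClusterUp() loop in the test above can hit an NPE once the cluster has been shut down.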

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira