hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2838) NPE in FSNamesystem when in safe mode
Date Wed, 25 Jan 2012 03:52:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192846#comment-13192846 ]

Uma Maheswara Rao G commented on HDFS-2838:
-------------------------------------------

+1

I just verified his sample test code, and it passes for me. Yes, it would be tricky to create
the situation where the safemode object is not null but the BlockManager is not completely up.
Thanks, Greg, for the patch.

                
> NPE in FSNamesystem when in safe mode
> -------------------------------------
>
>                 Key: HDFS-2838
>                 URL: https://issues.apache.org/jira/browse/HDFS-2838
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>         Attachments: HDFS-2838.patch
>
>
> I'm seeing an NPE when running HBase 0.92 unit tests against the HA branch.  The test
> failure is: org.apache.hadoop.hbase.regionserver.wal.TestHLog.testAppendClose.
> Here is the backtrace:
> java.lang.NullPointerException
> 	at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:179)
> 	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getActiveBlockCount(BlockManager.java:2465)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.doConsistencyCheck(FSNamesystem.java:3591)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.isOn(FSNamesystem.java:3285)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$900(FSNamesystem.java:3196)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isInSafeMode(FSNamesystem.java:3670)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.isInSafeMode(NameNode.java:609)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:1476)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:1487)
> Here is the relevant section of the test:
> {code}
>     try {
>       DistributedFileSystem dfs = (DistributedFileSystem) cluster.getFileSystem();
>       dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_ENTER);
>       cluster.shutdown();
>       try {
>         // wal.writer.close() will throw an exception,
>         // but still call this since it closes the LogSyncer thread first
>         wal.close();
>       } catch (IOException e) {
>         LOG.info(e);
>       }
>       fs.close(); // closing FS last so DFSOutputStream can't call close
>       LOG.info("STOPPED first instance of the cluster");
>     } finally {
>       // Restart the cluster
>       while (cluster.isClusterUp()){
>         LOG.error("Waiting for cluster to go down");
>         Thread.sleep(1000);
>       }
> {code}
> Fix looks trivial, will include patch shortly.
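
For readers following the backtrace, here is a minimal sketch of the failure pattern it suggests: a safe-mode query that reaches into a block map after shutdown. This is not the attached patch and not the real HDFS code. The class and method names (SafeModeNpeSketch, SketchBlocksMap, sizeOrZero, isInSafeMode) are hypothetical, and the sketch assumes the shutdown path leaves the map's backing storage null, which the backtrace hints at but does not confirm.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of the failure; not the actual HDFS classes. */
public class SafeModeNpeSketch {

    /** Stand-in for a block map whose backing storage is released on shutdown. */
    static class SketchBlocksMap {
        private Map<Long, String> blocks = new HashMap<>();

        void add(long id, String name) {
            blocks.put(id, name);
        }

        /** Assumption: shutdown drops the reference to the backing map. */
        void close() {
            blocks = null;
        }

        /** Unguarded: throws NullPointerException once close() has run. */
        int size() {
            return blocks.size();
        }

        /** Guarded variant: reports zero blocks after close(). */
        int sizeOrZero() {
            Map<Long, String> current = blocks;
            return current == null ? 0 : current.size();
        }
    }

    /** Stand-in for a safe-mode query that consults the block count. */
    static boolean isInSafeMode(SketchBlocksMap map, boolean safeModeOn, boolean guarded) {
        if (!safeModeOn) {
            return false;
        }
        // The consistency check needs a block count; this is where the NPE surfaces.
        int activeBlocks = guarded ? map.sizeOrZero() : map.size();
        return activeBlocks >= 0;
    }

    public static void main(String[] args) {
        SketchBlocksMap map = new SketchBlocksMap();
        map.add(1L, "blk_1");
        map.close(); // mirrors cluster.shutdown() happening while safe mode is still on

        try {
            isInSafeMode(map, true, false);
        } catch (NullPointerException e) {
            System.out.println("unguarded check threw NullPointerException");
        }
        System.out.println("guarded check returned " + isInSafeMode(map, true, true));
    }
}
```

The guard here sits in the map accessor purely for illustration. The actual patch may place the check elsewhere, for example in the safe-mode consistency check itself.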

