hadoop-hdfs-dev mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-1878) race condition in FSNamesystem.close() causes NullPointerException without serious consequence - TestHDFSServerPorts unit test failure
Date Tue, 03 May 2011 02:18:03 GMT
race condition in FSNamesystem.close() causes NullPointerException without serious consequence
- TestHDFSServerPorts unit test failure
--------------------------------------------------------------------------------------------------------------------------------------

                 Key: HDFS-1878
                 URL: https://issues.apache.org/jira/browse/HDFS-1878
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: 0.20.204.0
            Reporter: Matt Foley
            Assignee: Matt Foley
            Priority: Minor
             Fix For: 0.20.205.0


TestHDFSServerPorts was observed to throw a NullPointerException intermittently.  The exception
occurs only when FSNamesystem.close() is called, that is, during Namenode shutdown, so this is
not a serious bug for .204.  TestHDFSServerPorts is more likely than normal execution to trigger
the race because it runs two Namenodes in the same JVM, which produces more thread interleaving
and more opportunity for the race to surface.

The race is in FSNamesystem.close().  At line 566 we have:
      if (replthread != null) replthread.interrupt();
      if (replmon != null) replmon = null;

Because close() does not wait for the interrupted replthread to exit, replmon can be set to
null while replthread is still running.  replthread dereferences replmon in computeDatanodeWork(),
which is where the NullPointerException occurs.
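
For illustration only, the interleaving described above can be forced deterministically in a
small standalone sketch.  This is not the FSNamesystem code: the class CloseRaceSketch, the
Monitor stand-in, its computeWork() method and the two latches are all hypothetical, and the
latches exist only to pin down the unlucky ordering that the real race hits by chance.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of the close() race; not the actual FSNamesystem code. */
public class CloseRaceSketch {
    /** Stand-in for the replication monitor object (assumed name). */
    static class Monitor {
        void computeWork() { /* no-op placeholder for computeDatanodeWork() */ }
    }

    private volatile Monitor replmon = new Monitor();
    private Thread replthread;
    private final CountDownLatch workerRunning = new CountDownLatch(1);
    private final CountDownLatch fieldNulled = new CountDownLatch(1);
    private final AtomicReference<Throwable> failure = new AtomicReference<>();

    void start() {
        replthread = new Thread(() -> {
            try {
                workerRunning.countDown();
                // Force the unlucky ordering: do one more pass only after
                // close() has already nulled the field.
                awaitIgnoringInterrupt(fieldNulled);
                replmon.computeWork();          // dereferences null here
            } catch (Throwable t) {
                failure.set(t);                 // record what the worker hit
            }
        });
        replthread.start();
    }

    /** Mirrors the two statements quoted in the report. */
    void close() throws InterruptedException {
        workerRunning.await();
        if (replthread != null) replthread.interrupt();
        if (replmon != null) replmon = null;    // worker is not waited on
        fieldNulled.countDown();
    }

    private static void awaitIgnoringInterrupt(CountDownLatch latch) {
        boolean done = false;
        while (!done) {
            try {
                latch.await();
                done = true;
            } catch (InterruptedException ignored) {
                // The interrupt from close() lands here; keep waiting.
            }
        }
    }

    /** Runs the scenario and returns whatever the worker thread threw. */
    static Throwable demonstrate() {
        CloseRaceSketch s = new CloseRaceSketch();
        try {
            s.start();
            s.close();
            s.replthread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.failure.get();
    }

    public static void main(String[] args) {
        System.out.println("worker failure: " + demonstrate());
    }
}
```

In the real Namenode nothing forces this ordering, which is why the failure is intermittent.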

The fix is either to wait on replthread before nulling replmon, or simply not to null replmon.
The latter is preferred, since none of the sibling Namenode processing threads are waited on in close().
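
As a rough sketch of the two options (again with hypothetical names, not the attached patch),
the class below shows a close() that interrupts the worker but leaves replmon alone, and an
alternative that joins the worker before nulling the field.  CloseFixSketch, Monitor,
closeWithoutNulling() and closeAndWait() are assumed names for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of the two fixes; not the actual patch. */
public class CloseFixSketch {
    /** Stand-in for the replication monitor object (assumed name). */
    static class Monitor {
        void computeWork() { /* no-op placeholder for computeDatanodeWork() */ }
    }

    private volatile Monitor replmon = new Monitor();
    private final AtomicReference<Throwable> failure = new AtomicReference<>();
    private final Thread replthread = new Thread(this::runLoop);

    private void runLoop() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                replmon.computeWork();
            }
            // One more pass after the interrupt, as a late iteration could make.
            replmon.computeWork();
        } catch (Throwable t) {
            failure.set(t);
        }
    }

    /** Preferred option: interrupt the worker and never null replmon. */
    void closeWithoutNulling() {
        replthread.interrupt();
    }

    /** Alternative: wait for the worker to exit before nulling replmon. */
    void closeAndWait() throws InterruptedException {
        replthread.interrupt();
        replthread.join();
        replmon = null;
    }

    /** Runs one option and returns what the worker threw (null if nothing). */
    static Throwable run(boolean waitForThread) {
        CloseFixSketch s = new CloseFixSketch();
        s.replthread.start();
        try {
            if (waitForThread) {
                s.closeAndWait();
            } else {
                s.closeWithoutNulling();
            }
            s.replthread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.failure.get();
    }

    public static void main(String[] args) {
        System.out.println("without nulling: " + run(false));
        System.out.println("join then null:  " + run(true));
    }
}
```

Not nulling the field keeps close() non-blocking, which matches how the other Namenode threads
are treated; joining makes shutdown wait on the worker.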

I'll attach a patch for .205.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
