hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1257) Race condition on FSNamesystem#recentInvalidateSets introduced by HADOOP-5124
Date Mon, 15 Aug 2011 20:47:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085319#comment-13085319
] 

Eric Payne commented on HDFS-1257:
----------------------------------

Hi Nicholas. Thank you for reviewing this patch.

> belongsToInvalidates(..) is not synchronized.
> In dumpRecentInvalidateSets(..), recentInvalidateSets.values().size() is not synchronized.
> In invalidateWorkForOneNode(..), recentInvalidateSets.get(nodeId) is not synchronized.
My thinking was that since these are "gets" and not iterations or modifications, they didn't
need to be synchronized. However, that assumption is based on a dependency that the underlying
implementation doesn't iterate in order to do the "get" or "size". So, I went ahead and put
a synchronized() around these places as well, just as you suggested.


> Please use junit 4 in the test (i.e. org.junit.* instead of junit.framework.*).
Done


> Suggestion: how about move recentInvalidateSets and the related method to a standalone
class?
I looked into doing this, and the methods that manipulate the recentInvalidateSets object
are fairly dependent on and intertwined with the BlockManager class. These methods use several
private BlockManager variables, for example. I think the separation is do-able, but introduces
more risk and would require more care and testing. Can we create a separate Jira for this?


> Race condition on FSNamesystem#recentInvalidateSets introduced by HADOOP-5124
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-1257
>                 URL: https://issues.apache.org/jira/browse/HDFS-1257
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Ramkumar Vadali
>            Assignee: Eric Payne
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1257.1.20110810.patch, HDFS-1257.2.20110812.patch, HDFS-1257.3.20110815.patch,
HDFS-1257.patch
>
>
> HADOOP-5124 provided some improvements to FSNamesystem#recentInvalidateSets. But it introduced
unprotected access to the data structure recentInvalidateSets. Specifically, FSNamesystem.computeInvalidateWork
accesses recentInvalidateSets without read-lock protection. If there is concurrent activity
(like reducing replication on a file) that adds to recentInvalidateSets, the name-node crashes
with a ConcurrentModificationException.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message