hadoop-hdfs-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-1257) Race condition on FSNamesystem#recentInvalidateSets introduced by HADOOP-5124
Date Wed, 17 Aug 2011 19:35:27 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated HDFS-1257:
-----------------------------

    Attachment: HDFS-1257.5.20110817.patch

Removed TestProtectedBlockManager.java

The following manual tests were performed:

   1. On node 0, start the namenode daemon.
   2. On nodes 1-9, run org.apache.hadoop.hdfs.DataNodeCluster, which uses MiniDFSCluster to
start 170 simulated datanodes per hardware node.
   3. On the client, run org.apache.hadoop.fs.loadGenerator.LoadGenerator for several minutes
to simulate a very heavy load of creates, writes, reads, and deletes.

RESULTS:
Before this patch, the NameNode would exit with a ConcurrentModificationException when
recentInvalidateSets was modified while being iterated.

After this patch, the NameNode stays stable for the length of the test.



> Race condition on FSNamesystem#recentInvalidateSets introduced by HADOOP-5124
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-1257
>                 URL: https://issues.apache.org/jira/browse/HDFS-1257
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Ramkumar Vadali
>            Assignee: Eric Payne
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1257.1.20110810.patch, HDFS-1257.2.20110812.patch, HDFS-1257.3.20110815.patch,
>                      HDFS-1257.4.20110816.patch, HDFS-1257.5.20110817.patch, HDFS-1257.patch
>
>
> HADOOP-5124 provided some improvements to FSNamesystem#recentInvalidateSets. But it introduced
> unprotected access to the data structure recentInvalidateSets. Specifically, FSNamesystem.computeInvalidateWork
> accesses recentInvalidateSets without read-lock protection. If there is concurrent activity
> (like reducing replication on a file) that adds to recentInvalidateSets, the name-node crashes
> with a ConcurrentModificationException.
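
For illustration only, the failure mode described above can be sketched in a few lines of Java. This is not the attached patch and not the real FSNamesystem code: the class InvalidateSetsSketch, its methods, and the map layout (datanode ID to a list of block IDs) are all hypothetical stand-ins. The sketch assumes a java.util.concurrent ReentrantReadWriteLock guards the map; how the actual patch protects recentInvalidateSets is not stated in this message.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the name-node state; not the real FSNamesystem.
final class InvalidateSetsSketch {
    // Assumed layout: datanode ID -> block IDs queued for invalidation.
    private final Map<String, List<Long>> recentInvalidateSets = new TreeMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Writer path (e.g. reducing replication on a file): takes the write lock.
    void addToInvalidates(String datanode, long blockId) {
        lock.writeLock().lock();
        try {
            recentInvalidateSets.computeIfAbsent(datanode, k -> new ArrayList<>()).add(blockId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader path: iterates under the read lock, so no writer can interleave.
    int countPendingBlocks() {
        lock.readLock().lock();
        try {
            int total = 0;
            for (List<Long> blocks : recentInvalidateSets.values()) {
                total += blocks.size();
            }
            return total;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Reproduces the crash deterministically on a single thread: adding a key
    // while iterating makes the fail-fast iterator throw on its next step.
    static boolean unguardedIterationFails() {
        Map<String, List<Long>> sets = new TreeMap<>();
        sets.put("dn1", new ArrayList<>());
        sets.put("dn2", new ArrayList<>());
        try {
            for (String datanode : sets.keySet()) {
                sets.put(datanode + "-new", new ArrayList<>());
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Runs several threads that add and iterate through the guarded methods,
    // then returns the total number of queued blocks.
    static int stress(int writers, int addsPerWriter) {
        InvalidateSetsSketch ns = new InvalidateSetsSketch();
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < writers; w++) {
            final int id = w;
            threads.add(new Thread(() -> {
                for (int i = 0; i < addsPerWriter; i++) {
                    ns.addToInvalidates("dn" + id + "-" + i, i);
                    ns.countPendingBlocks();
                }
            }));
        }
        for (Thread t : threads) {
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return ns.countPendingBlocks();
    }

    public static void main(String[] args) {
        System.out.println("unguarded iteration fails: " + unguardedIterationFails());
        System.out.println("blocks after guarded stress: " + stress(4, 500));
    }
}
```

The single-threaded reproduction is used because a true cross-thread race only fails intermittently; the fail-fast iterator check it trips is the same one a concurrent writer would trip.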

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
