[ https://issues.apache.org/jira/browse/HDFS-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Payne updated HDFS-1257:
-----------------------------
Attachment: HDFS-1257.5.20110817.patch
Removed TestProtectedBlockManager.java
The following manual tests were performed:
1. On node 0, start the namenode daemon.
2. On nodes 1-9, run org.apache.hadoop.hdfs.DataNodeCluster which uses MiniDFSCluster to
start 170 simulated datanodes per hardware node.
3. On the client, run org.apache.hadoop.fs.loadGenerator.LoadGenerator for several minutes
to simulate a very heavy load of creates, writes, reads, and deletes.
RESULTS:
Before this patch, the NameNode would exit with a ConcurrentModificationException while
iterating over or modifying recentInvalidateSets.
After this patch, the NameNode stays stable for the length of the test.
> Race condition on FSNamesystem#recentInvalidateSets introduced by HADOOP-5124
> -----------------------------------------------------------------------------
>
> Key: HDFS-1257
> URL: https://issues.apache.org/jira/browse/HDFS-1257
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.23.0
> Reporter: Ramkumar Vadali
> Assignee: Eric Payne
> Fix For: 0.23.0
>
> Attachments: HDFS-1257.1.20110810.patch, HDFS-1257.2.20110812.patch, HDFS-1257.3.20110815.patch,
> HDFS-1257.4.20110816.patch, HDFS-1257.5.20110817.patch, HDFS-1257.patch
>
>
> HADOOP-5124 provided some improvements to FSNamesystem#recentInvalidateSets. But it introduced
> unprotected access to the data structure recentInvalidateSets. Specifically, FSNamesystem.computeInvalidateWork
> accesses recentInvalidateSets without read-lock protection. If there is concurrent activity
> (like reducing replication on a file) that adds to recentInvalidateSets, the name-node crashes
> with a ConcurrentModificationException.
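
The failure mode described above, and one way to guard against it, can be sketched in a few lines of Java. This is a minimal illustration only, not the code from the attached patches: the class InvalidateSetsSketch, the helper methods pendingNodes and unguardedIterationFails, the use of a TreeMap keyed by storage ID, and the choice of a ReentrantReadWriteLock are all assumptions made for the example. Only the names recentInvalidateSets and computeInvalidateWork come from the issue. The sketch reproduces the exception deterministically in a single thread by modifying the map in the middle of an iteration, which is the same check that fires when a second thread adds an entry while computeInvalidateWork is iterating.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the relevant part of FSNamesystem.
class InvalidateSetsSketch {
    // Storage ID -> block IDs waiting to be invalidated (assumed layout).
    private final Map<String, Set<Long>> recentInvalidateSets = new TreeMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers (e.g. reducing replication on a file) take the write lock.
    void addToInvalidates(String storageId, long blockId) {
        lock.writeLock().lock();
        try {
            recentInvalidateSets
                .computeIfAbsent(storageId, k -> new HashSet<>())
                .add(blockId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Drains up to nodesToProcess entries and returns the number of blocks
    // scheduled. It removes entries, so it also takes the write lock here.
    int computeInvalidateWork(int nodesToProcess) {
        lock.writeLock().lock();
        try {
            // Copy the keys so removal never touches a live iterator.
            List<String> ids = new ArrayList<>(recentInvalidateSets.keySet());
            int limit = Math.min(nodesToProcess, ids.size());
            int blocks = 0;
            for (String id : ids.subList(0, limit)) {
                blocks += recentInvalidateSets.remove(id).size();
            }
            return blocks;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Read-only access only needs the read lock.
    int pendingNodes() {
        lock.readLock().lock();
        try {
            return recentInvalidateSets.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Shows the unguarded failure: a structural change during iteration
    // makes the iterator's next step throw.
    static boolean unguardedIterationFails() {
        Map<String, Set<Long>> map = new TreeMap<>();
        map.put("dn-1", new HashSet<>());
        map.put("dn-2", new HashSet<>());
        try {
            for (String id : map.keySet()) {
                map.put("dn-3", new HashSet<>()); // new key added mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

With the lock in place, a call such as addToInvalidates("dn-1", 1L) from one thread can no longer interleave with the loop in computeInvalidateWork on another thread, so the iterator never observes a modified map.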
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira