hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-1562) Add rack policy tests
Date Mon, 18 Apr 2011 22:31:07 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HDFS-1562:
------------------------------

    Attachment: hdfs-1562-3.patch

Hey Matt,

Thanks for reviewing! Updated patch attached.

* Addresses HDFS-1828 by making waitForReplication check for exact values
* Added a comment by each config option being set with rationale
* Folds all utility methods into DFSTestUtil. I used the NameNodeAdapter for waitForReplication
since it uses protected methods. This method is needed in addition to waitReplication because
it checks for specific values of neededReplications not exposed via the FileSystem API (the
test is more fine-grained).
* Good point WRT waitForCorruptReplicas. The test actually has the opposite problem: it explicitly
attempts to report the corrupt replica from the client (via file access) because the datanode
checking takes so long (the DataBlockScanner period is measured in hours, so it doesn't execute
during the test runs). In the test, after the client reports the corrupt block to the namenode,
it immediately queries the namenode state to check that a corrupt replica has been identified
so it can wait for replication. After looping this test, however, I discovered a problem with
this approach too: sometimes the client only accesses the non-corrupt block location and therefore
doesn't trigger the detection of the corrupt replica. The code for testing corrupt replicas
in TestDatanodeBlockScanner (restart the DN, which will trigger block scanning) looks sound, so
I refactored it out to a new method (DFSTestUtil#waitCorruptReplicas) and used it here.
* Also refactored TestDatanodeBlockScanner to use waitReplication and new methods waitCorruptReplicas
and isBlockCorrupt. 
* Removes TestDatanodeBlockScanner#corruptReplica in favor of MiniDFSCluster#corruptReplica
(same implementation)
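
For readers following along, the "wait for an exact value" idea in the first bullets can be
sketched roughly as below. This is only an illustration under assumptions, not the code in the
attached patch: the class WaitForExactValue, the method waitForExact and its parameters are
hypothetical names, and the counter in main is a stand-in for a namenode value such as
neededReplications.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.IntSupplier;

public class WaitForExactValue {

    /**
     * Polls a value until it equals the expected one, or fails after a timeout.
     * Hypothetical helper for illustration; it is not the DFSTestUtil method
     * from the patch.
     */
    public static void waitForExact(String what, IntSupplier actual, int expected,
                                    long timeoutMs, long pollMs)
            throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        int last = actual.getAsInt();
        // Compare for equality rather than ">=" so that an overshoot is not
        // silently accepted as success.
        while (last != expected) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Timed out waiting for " + what
                        + ": expected " + expected + ", last saw " + last);
            }
            Thread.sleep(pollMs);
            last = actual.getAsInt();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a counter such as neededReplications: it drops by one
        // each time it is read until it reaches zero.
        int[] needed = {3};
        waitForExact("neededReplications",
                () -> needed[0] > 0 ? needed[0]-- : 0, 0, 5000, 10);
        System.out.println("neededReplications reached the expected value");
    }
}
```

Reporting the last observed value on timeout makes a looped test failure easier to diagnose
than a bare timeout.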

I've looped the test using this patch and so far have seen no failures.

Thanks,
Eli

> Add rack policy tests
> ---------------------
>
>                 Key: HDFS-1562
>                 URL: https://issues.apache.org/jira/browse/HDFS-1562
>             Project: Hadoop HDFS
>          Issue Type: Test
>          Components: name-node, test
>    Affects Versions: 0.23.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: hdfs-1562-1.patch, hdfs-1562-2.patch, hdfs-1562-3.patch
>
>
> The existing replication tests (TestBlocksWithNotEnoughRacks, TestPendingReplication,
TestOverReplicatedBlocks, TestReplicationPolicy, TestUnderReplicatedBlocks, and TestReplication)
are missing tests for rack policy violations.  This jira adds the following tests which I
created when generating a new patch for HDFS-15.
> * Test that blocks that have a sufficient number of total replicas, but are not replicated
cross rack, get replicated cross rack when a rack becomes available.
> * Test that new blocks for an under-replicated file will get replicated cross rack.
> * Mark a block as corrupt, and test that when it is re-replicated it is still replicated
across racks.
> * Reduce the replication factor of a file, making sure that the only block that is across
racks is not removed when deleting replicas.
> * Test that when a block is replicated because a replica is lost due to host failure
the rack policy is preserved.
> * Test that when the excess replicas of a block are reduced due to a node re-joining
the cluster the rack policy is not violated.
> * Test that rack policy is still respected when blocks are replicated due to node decommissioning.
> * Test that rack policy is still respected when blocks are replicated due to node decommissioning,
even when the blocks are over-replicated.
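
The condition these tests assert can be pictured with a small sketch. It is a simplification
under assumptions, not the namenode's actual check: the class RackPolicyCheck, the method
satisfiesRackPolicy and the rack names are hypothetical, and the rule modelled is only "a block
with two or more replicas should span at least two racks when the cluster has at least two".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class RackPolicyCheck {

    /**
     * Returns true if the replicas of one block are spread across racks.
     * Simplified, hypothetical rule for illustration: with fewer than two
     * replicas, or fewer than two racks in the cluster, there is nothing
     * to violate.
     */
    public static boolean satisfiesRackPolicy(List<String> replicaRacks,
                                              int racksInCluster) {
        if (replicaRacks.size() < 2 || racksInCluster < 2) {
            return true;
        }
        // Count the distinct racks that hold a replica of the block.
        return new HashSet<>(replicaRacks).size() >= 2;
    }

    public static void main(String[] args) {
        // Enough replicas in total, but all on one rack: policy violated.
        List<String> sameRack = Arrays.asList("/rack1", "/rack1", "/rack1");
        // After one replica moves to a second rack: policy satisfied.
        List<String> crossRack = Arrays.asList("/rack1", "/rack1", "/rack2");

        System.out.println("same rack ok?  " + satisfiesRackPolicy(sameRack, 2));
        System.out.println("cross rack ok? " + satisfiesRackPolicy(crossRack, 2));
    }
}
```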

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
