hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11511) Support Timeout when checking single disk
Date Sun, 12 Mar 2017 00:27:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906369#comment-15906369
] 

Arpit Agarwal edited comment on HDFS-11511 at 3/12/17 12:26 AM:
----------------------------------------------------------------

Comments on the unit tests:
# Can you please add timeouts on TestDatasetVolumeCheckerTimeout? You can either add {{@timeout}}
annotations on each test method or add a rule at the class level. e.g.
{code}
  @Rule
  public final Timeout testTimeout = new Timeout(300_000);
{code}
# Instead of calling {{Thread.sleep(DISK_CHECK_TIME)}}, the check callback of the slow volume
can wait on a mutex. testDiskCheckTimeout should unblock the waiting thread by signalling
the mutex before exiting.
# Looks like TestDatasetVolumeCheckerTimeout#IGNORED_CONTEXT can be removed.
# Nitpick: need space after {{//}} at {//Assert}}. Also at {{//Delay}} and {{//Wait}}.

Two tests failed for me:
{code}
testDiskCheckTimeout(org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout)
 Time elapsed: 0.474 sec  <<< FAILURE!
java.lang.AssertionError:
Expected: is <1L>
     but: was <0L>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:865)
	at org.junit.Assert.assertThat(Assert.java:832)
	at org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout.testDiskCheckTimeout(TestDatasetVolumeCheckerTimeout.java:121)

testTimeoutExceptionIsThrown(org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout)
 Time elapsed: 0.04 sec  <<< FAILURE!
java.lang.AssertionError:
Expected: is <1>
     but: was <0>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:865)
	at org.junit.Assert.assertThat(Assert.java:832)
	at org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout.testTimeoutExceptionIsThrown(TestDatasetVolumeCheckerTimeout.java:162)
{code}

The test coverage looks good. One suggestion - we can move the new unit tests to a TestThrottledAsyncCheckerWithTimeout
class, and retain one unit test in TestDatasetVolumeCheckerTimeout to verify that the DFS_DATANODE_DISK_CHECK_TIMEOUT_KEY
setting is respected. 




was (Author: arpitagarwal):
Comments on the unit tests:
# Can you please add timeouts on TestDatasetVolumeCheckerTimeout? You can either add {{@timeout}}
annotations on each test method or add an rule at the class level. e.g.
{code}
  @Rule
  public final Timeout testTimeout = new Timeout(300_000);
{code}
# Instead of calling {{Thread.sleep(DISK_CHECK_TIME)}}, the check callback of the slow volume
can wait on a mutex. testDiskCheckTimeout should unblock the waiting thread by signalling
the mutex before exiting.
# Looks like TestDatasetVolumeCheckerTimeout#IGNORED_CONTEXT can be removed.
# Nitpick: need space after {{//}} at {//Assert}}. Also at {{//Delay}} and {{//Wait}}.

Two tests failed for me:
{code}
testDiskCheckTimeout(org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout)
 Time elapsed: 0.474 sec  <<< FAILURE!
java.lang.AssertionError:
Expected: is <1L>
     but: was <0L>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:865)
	at org.junit.Assert.assertThat(Assert.java:832)
	at org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout.testDiskCheckTimeout(TestDatasetVolumeCheckerTimeout.java:121)

testTimeoutExceptionIsThrown(org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout)
 Time elapsed: 0.04 sec  <<< FAILURE!
java.lang.AssertionError:
Expected: is <1>
     but: was <0>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:865)
	at org.junit.Assert.assertThat(Assert.java:832)
	at org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout.testTimeoutExceptionIsThrown(TestDatasetVolumeCheckerTimeout.java:162)
{code}

The test coverage looks good. One suggestion - we can move the new unit tests to a TestThrottledAsyncCheckerWithTimeout
class, and retain one unit test in TestDatasetVolumeCheckerTimeout to verify that the DFS_DATANODE_DISK_CHECK_TIMEOUT_KEY
setting is respected. 



> Support Timeout when checking single disk
> -----------------------------------------
>
>                 Key: HDFS-11511
>                 URL: https://issues.apache.org/jira/browse/HDFS-11511
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Hanisha Koneru
>            Assignee: Hanisha Koneru
>         Attachments: HDFS-11511.000.patch, HDFS-11511.001.patch
>
>
> HDFS-11149 introduces parallel checking of FsVolumes by Datanode. Disk checks should
have a configurable timeout so that bad disks do not hang and cause long running checks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message