hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Zsolt Venczel (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-13339) Volume reference can't be released and leads to deadlock when DataXceiver does a check volume
Date Sat, 02 Jun 2018 21:42:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-13339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Zsolt Venczel updated HDFS-13339:
---------------------------------
    Attachment:     (was: HDFS-13339.004.patch)

> Volume reference can't be released and leads to deadlock when DataXceiver does a check
volume
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13339
>                 URL: https://issues.apache.org/jira/browse/HDFS-13339
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>         Environment: os: Linux 2.6.32-358.el6.x86_64
> hadoop version: hadoop-3.2.0-SNAPSHOT
> unit: mvn test -Pnative -Dtest=TestDataNodeVolumeFailureReporting#testVolFailureStatsPreservedOnNNRestart
>            Reporter: liaoyuxiangqin
>            Assignee: liaoyuxiangqin
>            Priority: Critical
>              Labels: DataNode, volumes
>         Attachments: HDFS-13339.001.patch, HDFS-13339.002.patch, HDFS-13339.003.patch
>
>
> When i execute Unit Test of
>  TestDataNodeVolumeFailureReporting#testVolFailureStatsPreservedOnNNRestart, the process
blocks on waitReplication, detail information as follows:
> [INFO] -------------------------------------------------------
>  [INFO] T E S T S
>  [INFO] -------------------------------------------------------
>  [INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
>  [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 307.492 s <<<
FAILURE! - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
>  [ERROR] testVolFailureStatsPreservedOnNNRestart(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting)
Time elapsed: 307.206 s <<< ERROR!
>  java.util.concurrent.TimeoutException: Timed out waiting for /test1 to reach 2 replicas
>  at org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:800)
>  at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.testVolFailureStatsPreservedOnNNRestart(TestDataNodeVolumeFailureReporting.java:283)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message