hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13174) hdfs mover -p /path times out after 20 min
Date Mon, 07 May 2018 23:28:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466611#comment-16466611 ]

Wei-Chiu Chuang commented on HDFS-13174:
----------------------------------------

Thanks for raising the issue, [~pifta]. The description makes sense to me. I'll review the
patch.

> hdfs mover -p /path times out after 20 min
> ------------------------------------------
>
>                 Key: HDFS-13174
>                 URL: https://issues.apache.org/jira/browse/HDFS-13174
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.8.0, 2.7.4, 3.0.0-alpha2
>            Reporter: Istvan Fajth
>            Assignee: Istvan Fajth
>            Priority: Major
>         Attachments: HDFS-13174.001.patch
>
>
> HDFS-11015 introduced an iteration timeout in the Dispatcher.Source class, which is checked while dispatching the moves that the Balancer and the Mover perform. This timeout is hardwired to 20 minutes.
> The Balancer works in iterations: even if one iteration times out, the Balancer keeps running and starts another iteration, and it fails if no moves happened over a few iterations.
> The Mover, on the other hand, does not have iterations, so if moving a path runs for more than 20 minutes and there are moves decided and enqueued between two DataNodes, the Mover stops after 20 minutes with the following exception reported to the console (line numbers might differ, as this exception came from a CDH5.12.1 installation):
>  java.io.IOException: Block move timed out
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:382)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:328)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:186)
>  at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:956)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  
> Note that this issue does not occur if all blocks can be moved within their DataNodes, without having to move any block to another DataNode.
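
The difference between the two tools described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the actual Dispatcher code: the class name, the method names, and the idea of measuring work in minutes are all hypothetical and chosen for illustration. The only detail taken from the description is the 20-minute limit. It shows why a tool that restarts its clock per iteration can get through work that a single-pass tool cannot.

```java
import java.util.concurrent.TimeUnit;

/**
 * Simplified, hypothetical model of the behaviour described in HDFS-13174.
 * This is NOT the Hadoop Dispatcher implementation; all names are made up.
 */
public class IterationTimeoutSketch {

    // The issue describes the limit as hardwired to 20 minutes.
    static final long MAX_ITERATION_MILLIS = TimeUnit.MINUTES.toMillis(20);

    /** True once the time elapsed since startMillis exceeds the fixed limit. */
    static boolean isIterationOver(long startMillis, long nowMillis) {
        return nowMillis - startMillis > MAX_ITERATION_MILLIS;
    }

    /**
     * Balancer-like model: each iteration gets a fresh clock, so long-running
     * work is spread over several iterations. Returns the iterations needed.
     */
    static int balancerStyleIterations(long totalWorkMinutes) {
        long remaining = TimeUnit.MINUTES.toMillis(totalWorkMinutes);
        int iterations = 0;
        while (remaining > 0) {
            iterations++;
            // One iteration handles at most MAX_ITERATION_MILLIS of work.
            remaining -= Math.min(remaining, MAX_ITERATION_MILLIS);
        }
        return iterations;
    }

    /**
     * Mover-like model: a single pass with no iterations, so the clock is
     * never reset. Returns true only if the work fits within the limit.
     */
    static boolean moverStyleCompletes(long totalWorkMinutes) {
        long elapsed = TimeUnit.MINUTES.toMillis(totalWorkMinutes);
        return !isIterationOver(0L, elapsed);
    }

    public static void main(String[] args) {
        System.out.println("Balancer-like, 45 min of work, iterations: "
                + balancerStyleIterations(45));
        System.out.println("Mover-like, 15 min of work, completes: "
                + moverStyleCompletes(15));
        System.out.println("Mover-like, 25 min of work, completes: "
                + moverStyleCompletes(25));
    }
}
```

In this model, 45 minutes of work takes three iterations in the Balancer-like case, while the Mover-like single pass succeeds for 15 minutes of work and fails for 25.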



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

