hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rakesh R (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-11164) Mover should avoid unnecessary retries if the block is pinned
Date Tue, 22 Nov 2016 18:04:58 GMT

     [ https://issues.apache.org/jira/browse/HDFS-11164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Rakesh R updated HDFS-11164:
----------------------------
    Description: 
When mover is trying to move a pinned block to another datanode, it will internally hits the
following IOException and mark the block movement as {{failure}}. Since the Mover has {{dfs.mover.retry.max.attempts}}
configs, it will continue moving this block until it reaches {{retryMaxAttempts}}. If the
block movement failure(s) are only due to block pinning, then retry is unnecessary. The idea
of this jira is to avoid retry attempts of pinned blocks as they won't be able to move to
a different node. 

{code}
2016-11-22 10:56:10,537 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: Failed to
move blk_1073741825_1001 with size=52 from 127.0.0.1:19501:DISK to 127.0.0.1:19758:ARCHIVE
through 127.0.0.1:19501
java.io.IOException: Got error, status=ERROR, status message opReplaceBlock BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001
received exception java.io.IOException: Got error, status=ERROR, status message Not able to
copy block 1073741825 to /127.0.0.1:19826 because it's pinned , copy block BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001
from /127.0.0.1:19501, reportedBlock move is failed
	at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:118)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:417)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:358)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$5(Dispatcher.java:322)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1075)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
{code}

  was:
When mover is trying to move a pinned block to another datanode, it will internally hits the
following IOException and mark the block movement as {{failure}}. Since the Mover has {{dfs.mover.retry.max.attempts}}
configs, it will continue moving this block until it reaches {{retryMaxAttempts}}. This retry
is unnecessary and would be good to avoid retry attempts as pinned block won't be able to
move.

{code}
2016-11-22 10:56:10,537 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: Failed to
move blk_1073741825_1001 with size=52 from 127.0.0.1:19501:DISK to 127.0.0.1:19758:ARCHIVE
through 127.0.0.1:19501
java.io.IOException: Got error, status=ERROR, status message opReplaceBlock BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001
received exception java.io.IOException: Got error, status=ERROR, status message Not able to
copy block 1073741825 to /127.0.0.1:19826 because it's pinned , copy block BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001
from /127.0.0.1:19501, reportedBlock move is failed
	at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:118)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:417)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:358)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$5(Dispatcher.java:322)
	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1075)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
{code}


> Mover should avoid unnecessary retries if the block is pinned
> -------------------------------------------------------------
>
>                 Key: HDFS-11164
>                 URL: https://issues.apache.org/jira/browse/HDFS-11164
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-11164-00.patch, HDFS-11164-01.patch
>
>
> When mover is trying to move a pinned block to another datanode, it will internally hits
the following IOException and mark the block movement as {{failure}}. Since the Mover has
{{dfs.mover.retry.max.attempts}} configs, it will continue moving this block until it reaches
{{retryMaxAttempts}}. If the block movement failure(s) are only due to block pinning, then
retry is unnecessary. The idea of this jira is to avoid retry attempts of pinned blocks as
they won't be able to move to a different node. 
> {code}
> 2016-11-22 10:56:10,537 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: Failed
to move blk_1073741825_1001 with size=52 from 127.0.0.1:19501:DISK to 127.0.0.1:19758:ARCHIVE
through 127.0.0.1:19501
> java.io.IOException: Got error, status=ERROR, status message opReplaceBlock BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001
received exception java.io.IOException: Got error, status=ERROR, status message Not able to
copy block 1073741825 to /127.0.0.1:19826 because it's pinned , copy block BP-1772076264-10.252.146.200-1479792322960:blk_1073741825_1001
from /127.0.0.1:19501, reportedBlock move is failed
> 	at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:118)
> 	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.receiveResponse(Dispatcher.java:417)
> 	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:358)
> 	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$5(Dispatcher.java:322)
> 	at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1075)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message