hadoop-hdfs-issues mailing list archives

From "Lukas Majercak (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11499) Decommissioning stuck because of failing recovery
Date Sun, 05 Mar 2017 20:01:32 GMT
Lukas Majercak created HDFS-11499:
-------------------------------------

             Summary: Decommissioning stuck because of failing recovery
                 Key: HDFS-11499
                 URL: https://issues.apache.org/jira/browse/HDFS-11499
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs, namenode
    Affects Versions: 3.0.0-alpha2, 2.7.3, 2.7.2, 2.7.1
            Reporter: Lukas Majercak
            Assignee: Lukas Majercak


Block recovery fails to finalize a file when the locations of its last, incomplete block
are being decommissioned. Conversely, decommissioning gets stuck waiting for that last
block to be completed, so neither operation can make progress.

{code:java}
org.apache.hadoop.ipc.RemoteException(java.lang.IllegalStateException): Failed to finalize
INodeFile testRecoveryFile since blocks[255] is non-complete, where blocks=[blk_1073741825_1001,
blk_1073741826_1002...
{code}

The fix is to also count replicas on decommissioning nodes when completing the last block in
BlockManager.commitOrCompleteLastBlock. This is safe because the DecommissionManager will not
decommission a node that still has under-construction (UC) blocks.
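
As a rough illustration of the idea only, and not the actual patch, the sketch below models the completion check with hypothetical, simplified types. The names LastBlockCompletionSketch, ReplicaState, countLive, countUsable and canComplete are assumptions made for this example and do not come from the Hadoop code base; the minimum-replica threshold is likewise assumed.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of the check; not the real HDFS classes. */
public class LastBlockCompletionSketch {

    /** Simplified replica states; the real NameNode tracks more than these. */
    enum ReplicaState { LIVE, DECOMMISSIONING, DECOMMISSIONED }

    /** Counts only replicas on fully live nodes (the behavior described as failing). */
    static int countLive(List<ReplicaState> replicas) {
        int n = 0;
        for (ReplicaState s : replicas) {
            if (s == ReplicaState.LIVE) {
                n++;
            }
        }
        return n;
    }

    /** Counts replicas on live nodes plus those on decommissioning nodes (the proposed idea). */
    static int countUsable(List<ReplicaState> replicas) {
        int n = 0;
        for (ReplicaState s : replicas) {
            if (s == ReplicaState.LIVE || s == ReplicaState.DECOMMISSIONING) {
                n++;
            }
        }
        return n;
    }

    /** The last block may be completed once enough replicas are counted (assumed threshold). */
    static boolean canComplete(int countedReplicas, int minReplication) {
        return countedReplicas >= minReplication;
    }

    public static void main(String[] args) {
        // Every location of the last block is on a decommissioning node.
        List<ReplicaState> replicas = Arrays.asList(
                ReplicaState.DECOMMISSIONING, ReplicaState.DECOMMISSIONING);
        int minReplication = 1; // assumed value for the example

        System.out.println("live only: "
                + canComplete(countLive(replicas), minReplication));
        System.out.println("live + decommissioning: "
                + canComplete(countUsable(replicas), minReplication));
    }
}
```

In this toy model, counting only live replicas never lets the block complete while its nodes are decommissioning, whereas counting decommissioning replicas as well allows completion, after which decommissioning could proceed.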




