hadoop-hdfs-issues mailing list archives

From "zhaoyunjiong (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-5579) Under construction files make DataNode decommission take very long hours
Date Thu, 28 Nov 2013 06:27:35 GMT

     [ https://issues.apache.org/jira/browse/HDFS-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhaoyunjiong updated HDFS-5579:

    Attachment: HDFS-5579.patch

This patch lets the NameNode replicate blocks that belong to under-construction files, except
for the last block.
And if the decommissioning DataNodes only have blocks which are the last blocks of under-construction
files and which have more than one live replica left behind, then the NameNode could set

> Under construction files make DataNode decommission take very long hours
> ------------------------------------------------------------------------
>                 Key: HDFS-5579
>                 URL: https://issues.apache.org/jira/browse/HDFS-5579
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 1.2.0, 2.2.0
>            Reporter: zhaoyunjiong
>            Assignee: zhaoyunjiong
>         Attachments: HDFS-5579-branch-1.2.patch, HDFS-5579.patch
> We noticed that decommissioning DataNodes sometimes takes a very long time, even exceeding
100 hours.
> After checking the code, I found that BlockManager#computeReplicationWorkForBlocks(List<List<Block>>
blocksToReplicate) won't replicate blocks that belong to under-construction files; however,
in BlockManager#isReplicationInProgress(DatanodeDescriptor srcNode), if there is any block that
needs replication, no matter whether it belongs to an under-construction file or not, the
decommission is considered still in progress.
> That's why the decommission sometimes takes a very long time.
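The mismatch described above can be sketched in a few lines. This is a simplified, hypothetical model, not the actual Hadoop code: `Block`, `scheduleReplicationWork`, and `replicationInProgress` are stand-ins for the real `BlockManager` logic, reduced to the one condition that matters here.

```java
import java.util.ArrayList;
import java.util.List;

public class DecommissionSketch {

    // Hypothetical stand-in for an HDFS block; only tracks whether
    // it belongs to an under-construction file.
    public static class Block {
        public final boolean underConstruction;
        public Block(boolean underConstruction) {
            this.underConstruction = underConstruction;
        }
    }

    // Mirrors computeReplicationWorkForBlocks: blocks of under-construction
    // files are skipped, so no replication work is ever scheduled for them.
    public static int scheduleReplicationWork(List<Block> needed) {
        int scheduled = 0;
        for (Block b : needed) {
            if (b.underConstruction) {
                continue; // skipped: never replicated
            }
            scheduled++;
        }
        return scheduled;
    }

    // Mirrors isReplicationInProgress: any under-replicated block, including
    // an under-construction one, keeps the node in decommission-in-progress.
    public static boolean replicationInProgress(List<Block> needed) {
        return !needed.isEmpty();
    }

    public static void main(String[] args) {
        List<Block> needed = new ArrayList<>();
        needed.add(new Block(true)); // last block of an open file

        System.out.println("work scheduled: " + scheduleReplicationWork(needed));
        System.out.println("still decommissioning: " + replicationInProgress(needed));
        // When only under-construction blocks remain, no work is scheduled,
        // yet the progress check never clears: decommission appears to hang.
    }
}
```

With only an under-construction block left, the scheduler produces no work while the progress check still returns true, which is exactly the stall the report describes.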

This message was sent by Atlassian JIRA
