hadoop-hdfs-issues mailing list archives

From "Vinayakumar B (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9600) do not check replication if the block is under construction
Date Thu, 07 Jan 2016 05:51:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086858#comment-15086858 ]

Vinayakumar B commented on HDFS-9600:

bq. The native build fails when libwebhdfs in contrib is built. This is not the case if you
simply do -Pnative. I think it is HDFS-8346.
There might be another reason for the failure in branch-2.
But I have seen it fail in both docker and non-docker mode in branch-2.6. It fails in branch-2.6
in docker mode because dev-support/DockerFile does not exist in branch-2.6.

> do not check replication if the block is under construction
> -----------------------------------------------------------
>                 Key: HDFS-9600
>                 URL: https://issues.apache.org/jira/browse/HDFS-9600
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>            Priority: Critical
>         Attachments: HDFS-9600-branch-2.6.patch, HDFS-9600-branch-2.7.patch, HDFS-9600-branch-2.patch,
HDFS-9600-v1.patch, HDFS-9600-v2.patch, HDFS-9600-v3.patch, HDFS-9600-v4.patch
> When appending to a file, we update the pipeline to bump a new GS, and the old GS is
considered out of date. When changing the GS, BlockInfo.setGenerationStampAndVerifyReplicas
removes the replicas having the old GS, which means all replicas are removed, because no DN
has the new GS until the block with the new GS is added to blockMaps again by DatanodeProtocol.blockReceivedAndDeleted.
> If we check the replication of this block before it is added back, it will be regarded as
missing. The probability is low, but if there are decommissioning nodes, the DecommissionManager.Monitor
scans all blocks belonging to the decommissioning nodes very quickly, so the probability
of finding a "missing" block is very high, even though the blocks are not actually missing.
> Furthermore, after closing the appended file, FSNamesystem.finalizeINodeFileUnderConstruction
will call checkReplication. If some of the nodes are decommissioning, this block with the new GS will
be added to the UnderReplicatedBlocks map, so there are two blocks with the same ID in this map, one
And there will be many missing-block warnings on the NameNode website but there is no corrupt
> Therefore, I think the solution is that we should not check replication if the block is under
construction. We should only check complete blocks.
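
The race described above, and the proposed guard, can be sketched with a small toy model. This is only an illustration under stated assumptions and is not the attached patch: ToyBlock, Verdict, checkReplicationNaive and checkReplicationSkippingUc are hypothetical names, not Hadoop classes or methods, and the real NameNode logic is considerably more involved.

```java
import java.util.HashSet;
import java.util.Set;

public class ReplicationCheckSketch {
    enum BlockState { UNDER_CONSTRUCTION, COMPLETE }
    enum Verdict { SKIPPED, OK, UNDER_REPLICATED, MISSING }

    /** Toy stand-in for a NameNode block record (hypothetical, not the real BlockInfo). */
    static class ToyBlock {
        long generationStamp;
        BlockState state;
        final Set<String> replicaNodes = new HashSet<>();

        ToyBlock(long generationStamp, BlockState state) {
            this.generationStamp = generationStamp;
            this.state = state;
        }

        /**
         * Models the GS bump during append: replicas known under the old GS are
         * dropped. For brevity the toy also marks the block under construction here.
         */
        void bumpGenerationStamp(long newGs) {
            generationStamp = newGs;
            replicaNodes.clear();
            state = BlockState.UNDER_CONSTRUCTION;
        }

        /** Models a DataNode reporting a replica that carries the new GS. */
        void reportReplica(String node) {
            replicaNodes.add(node);
        }
    }

    /** Checks replication regardless of block state (the problematic behaviour). */
    static Verdict checkReplicationNaive(ToyBlock b, int expected) {
        int live = b.replicaNodes.size();
        if (live == 0) {
            return Verdict.MISSING;
        }
        return live < expected ? Verdict.UNDER_REPLICATED : Verdict.OK;
    }

    /** Proposed idea: only complete blocks are checked. */
    static Verdict checkReplicationSkippingUc(ToyBlock b, int expected) {
        if (b.state != BlockState.COMPLETE) {
            return Verdict.SKIPPED;
        }
        return checkReplicationNaive(b, expected);
    }

    public static void main(String[] args) {
        ToyBlock block = new ToyBlock(100L, BlockState.COMPLETE);
        block.reportReplica("dn1");
        block.reportReplica("dn2");
        block.reportReplica("dn3");

        // Append starts: the GS is bumped and no DN has reported the new GS yet.
        block.bumpGenerationStamp(101L);
        System.out.println("naive check during append:   " + checkReplicationNaive(block, 3));
        System.out.println("guarded check during append: " + checkReplicationSkippingUc(block, 3));

        // DNs report the block with the new GS and the file is closed.
        block.reportReplica("dn1");
        block.reportReplica("dn2");
        block.reportReplica("dn3");
        block.state = BlockState.COMPLETE;
        System.out.println("guarded check after close:   " + checkReplicationSkippingUc(block, 3));
    }
}
```

In this toy, a scan that lands in the window between the GS bump and the first replica report sees zero replicas, so the naive check reports the block as missing, while the guarded check simply skips it until the block is complete.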

This message was sent by Atlassian JIRA
