hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
Date Sat, 05 Sep 2015 04:56:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731799#comment-14731799 ]

Tsz Wo Nicholas Sze commented on HDFS-9011:
-------------------------------------------

There seems to be a bug: for each partial report RPC, the NN calls reportDiff(..), but reportDiff(..)
assumes a full block report.  I think the diff is incorrect for a partial report.  In particular,
the toRemove set may contain blocks that were reported by other RPCs.
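
To illustrate the concern, here is a toy model, not the NameNode's actual code. Block IDs are plain longs, and the class and method names (PartialReportDiffSketch, toRemoveAssumingFullReport, and so on) are hypothetical. It shows how a diff that assumes a full report marks blocks for removal when it is applied to each partial report separately. The merge-then-diff variant is only one possible remedy, included as an assumption for contrast, and is not necessarily what the patch should do.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: hypothetical names, not the real reportDiff(..) implementation. */
class PartialReportDiffSketch {

    /** Full-report diff: every stored block missing from the report is marked for removal. */
    static Set<Long> toRemoveAssumingFullReport(Set<Long> stored, Set<Long> reported) {
        Set<Long> toRemove = new TreeSet<>(stored);
        toRemove.removeAll(reported);
        return toRemove;
    }

    /** Applies the full-report diff to each partial report on its own (the suspected bug). */
    static Set<Long> toRemovePerPartialRpc(Set<Long> stored, List<Set<Long>> partialReports) {
        Set<Long> toRemove = new TreeSet<>();
        for (Set<Long> partial : partialReports) {
            toRemove.addAll(toRemoveAssumingFullReport(stored, partial));
        }
        return toRemove;
    }

    /** Assumed remedy for contrast: union all partial reports, then diff once. */
    static Set<Long> toRemoveAfterMerging(Set<Long> stored, List<Set<Long>> partialReports) {
        Set<Long> all = new HashSet<>();
        for (Set<Long> partial : partialReports) {
            all.addAll(partial);
        }
        return toRemoveAssumingFullReport(stored, all);
    }

    public static void main(String[] args) {
        // The storage really holds blocks 1..4 and reports them in two RPCs.
        Set<Long> stored = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        List<Set<Long>> rpcs = Arrays.asList(
                new HashSet<>(Arrays.asList(1L, 2L)),
                new HashSet<>(Arrays.asList(3L, 4L)));

        System.out.println(toRemovePerPartialRpc(stored, rpcs)); // prints [1, 2, 3, 4]
        System.out.println(toRemoveAfterMerging(stored, rpcs));  // prints []
    }
}
```

In this model, every block ends up in toRemove even though the DataNode reported all of them, because each RPC's diff treats the blocks carried by the other RPC as missing.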

> Support splitting BlockReport of a storage into multiple RPC
> ------------------------------------------------------------
>
>                 Key: HDFS-9011
>                 URL: https://issues.apache.org/jira/browse/HDFS-9011
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-9011.000.patch, HDFS-9011.001.patch, HDFS-9011.002.patch
>
>
> Currently, if a DataNode has too many blocks (more than 1 million by default), it sends multiple
> RPCs to the NameNode for the block report, each RPC containing the report for a single storage. However,
> in practice we have seen that even a single storage can contain a large number of blocks,
> and its report can exceed the max RPC data length. It may be helpful to support sending
> multiple RPCs for the block report of a single storage.
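
The proposal above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the attached patch: blocks are modeled as plain long IDs, and the class BlockReportSplitSketch, the method split, and the parameter maxBlocksPerRpc are all hypothetical names. A real limit would be derived from the configured max RPC data length rather than from a block count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: splits one storage's block list into per-RPC chunks. */
class BlockReportSplitSketch {

    /** Returns consecutive chunks of at most maxBlocksPerRpc block IDs each. */
    static List<List<Long>> split(List<Long> blockIds, int maxBlocksPerRpc) {
        if (maxBlocksPerRpc <= 0) {
            throw new IllegalArgumentException("maxBlocksPerRpc must be positive");
        }
        List<List<Long>> chunks = new ArrayList<>();
        for (int start = 0; start < blockIds.size(); start += maxBlocksPerRpc) {
            int end = Math.min(start + maxBlocksPerRpc, blockIds.size());
            // Copy the sublist so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(blockIds.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Long> blocks = Arrays.asList(10L, 11L, 12L, 13L, 14L);
        // Each inner list stands for the payload of one block report RPC.
        System.out.println(split(blocks, 2)); // prints [[10, 11], [12, 13], [14]]
    }
}
```

Each chunk is a partial report for the storage, so the receiving side would need to know that a given RPC does not describe the storage's complete contents.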



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
