hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8574) When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
Date Thu, 11 Jun 2015 14:29:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581979#comment-14581979 ]

Arpit Agarwal commented on HDFS-8574:
-------------------------------------

Hi [~ajithshetty], 

Thanks for reporting this. 

bq. there is one volume which contains files way more than dfs.blockreport.split.threshold (may be 10 times)
The default value of {{dfs.blockreport.split.threshold}} is 1 million. Even on a 10TB drive, 10 million blocks per drive works out to a mean block size of roughly 1MB (see the quick check below). HDFS is not designed to deal well with large numbers of such tiny blocks. Would you mind sharing some metrics about your target use case?
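
A quick back-of-the-envelope check of that figure (illustrative numbers only, assuming 10x the default threshold on a 10TB drive):

{code}
// Hypothetical illustration: mean block size when a 10TB drive holds
// 10 million blocks (10x the default dfs.blockreport.split.threshold of 1M).
public class MeanBlockSize {
  public static void main(String[] args) {
    long driveBytes = 10L * 1024 * 1024 * 1024 * 1024; // 10TB
    long blockCount = 10L * 1000 * 1000;               // 10 million blocks
    System.out.println(driveBytes / blockCount);       // 1099511 bytes, i.e. ~1MB per block
  }
}
{code}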

> When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8574
>                 URL: https://issues.apache.org/jira/browse/HDFS-8574
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Ajith S
>            Assignee: Ajith S
>
> This piece of code in {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}}:
> {code}
> // Send one block report per message.
> for (int r = 0; r < reports.length; r++) {
>   StorageBlockReport singleReport[] = { reports[r] };
>   DatanodeCommand cmd = bpNamenode.blockReport(
>       bpRegistration, bpos.getBlockPoolId(), singleReport,
>       new BlockReportContext(reports.length, r, reportId));
>   numReportsSent++;
>   numRPCs++;
>   if (cmd != null) {
>     cmds.add(cmd);
>   }
> }
> {code}
> When a single volume contains many blocks (i.e., more than the threshold), it still tries to send that volume's entire block report in one RPC, causing this exception:
> {code}
> java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369)
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347)
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473)
> {code}
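
For context, here is a simplified sketch of the dispatch around the quoted loop in {{BPServiceActor.blockReport()}} (a paraphrase, not verbatim source; {{sendOneReportPerStorage}} is a hypothetical stand-in for the loop quoted above). The split decision is made only across storages, never within one, so a volume whose block list alone exceeds the protobuf size limit still fails:

{code}
// Paraphrased sketch of the branch-2 dispatch (not verbatim source).
if (totalBlockCount < dnConf.blockReportSplitThreshold) {
  // Below the threshold: all storage reports go out in a single RPC.
  DatanodeCommand cmd = bpNamenode.blockReport(
      bpRegistration, bpos.getBlockPoolId(), reports,
      new BlockReportContext(1, 0, reportId));
  if (cmd != null) {
    cmds.add(cmd);
  }
} else {
  // Above the threshold: one RPC per storage, i.e. the loop quoted above.
  // A single storage's report is never subdivided, so one oversized volume
  // still produces one oversized protobuf message.
  sendOneReportPerStorage(reports, reportId); // hypothetical stand-in for the quoted loop
}
{code}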



