hadoop-hdfs-issues mailing list archives

From "Ajith S (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8574) When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
Date Tue, 16 Jun 2015 02:02:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587324#comment-14587324 ]

Ajith S commented on HDFS-8574:

Hi [~arpitagarwal]

Thanks for the input. Yes, you are right, HDFS was not designed for tiny blocks. My scenario
was this: I wanted to test the NameNode limits, so I inserted 10 million files of ~10KB each
(10KB because I had a small disk). My DN had a single {{data.dir}} directory when I faced this
exception, but when I increased the number of {{data.dir}} directories to 5, the issue was
resolved. Later I checked and came across this piece of code, where the block report is sent
per volume of the DN. My question is: if we check for overflow based on the number of blocks,
why do we split per storage report? A single per-volume report might still exceed the
{{dfs.blockreport.split.threshold}} limit.

Please correct me if I am wrong.
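
To make the concern concrete, here is a minimal, hypothetical sketch (this is not HDFS code; the class and method names are invented for illustration). Even if per-volume reports were greedily batched so that each RPC stays under {{dfs.blockreport.split.threshold}}, a single volume whose own block count exceeds the threshold would still end up as one oversized RPC, which is exactly the overflow case described above:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical batching sketch: group per-volume reports into RPCs whose
// combined block count stays under the split threshold. A lone volume that
// by itself exceeds the threshold still becomes a single oversized RPC --
// splitting per volume cannot subdivide one huge volume.
public class BlockReportSplitter {
    // Returns, for each RPC to send, the indices of the volumes whose
    // reports are bundled into that RPC.
    public static List<List<Integer>> planBatches(long[] blocksPerVolume,
                                                  long threshold) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBlocks = 0;
        for (int v = 0; v < blocksPerVolume.length; v++) {
            // Flush the current batch if adding this volume would overflow it.
            if (!current.isEmpty() && currentBlocks + blocksPerVolume[v] > threshold) {
                batches.add(current);
                current = new ArrayList<>();
                currentBlocks = 0;
            }
            current.add(v);
            currentBlocks += blocksPerVolume[v];
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // One volume holding 10M blocks with a 1M threshold is still sent as
        // a single RPC -- the overflow scenario from this issue.
        System.out.println(
            planBatches(new long[]{10_000_000L}, 1_000_000L).size()); // 1
        // Five volumes of 2M blocks each: one RPC per volume.
        System.out.println(
            planBatches(new long[]{2_000_000L, 2_000_000L, 2_000_000L,
                                   2_000_000L, 2_000_000L},
                        1_000_000L).size()); // 5
    }
}
```

In other words, batching by block count helps only when individual volumes are small; it does not address the case where one volume alone crosses the threshold.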

> When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
> ----------------------------------------------------------------------------------------------------
>                 Key: HDFS-8574
>                 URL: https://issues.apache.org/jira/browse/HDFS-8574
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Ajith S
>            Assignee: Ajith S
> This piece of code in {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}}
> {code}
> // Send one block report per message.
> for (int r = 0; r < reports.length; r++) {
>   StorageBlockReport singleReport[] = { reports[r] };
>   DatanodeCommand cmd = bpNamenode.blockReport(
>       bpRegistration, bpos.getBlockPoolId(), singleReport,
>       new BlockReportContext(reports.length, r, reportId));
>   numReportsSent++;
>   numRPCs++;
>   if (cmd != null) {
>     cmds.add(cmd);
>   }
> }
> {code}
> when a single volume contains many blocks, i.e. more than the threshold, it tries
> to send the entire block report in one RPC, causing an exception
> {code}
> java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException:
> Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to
> increase the size limit.
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369)
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347)
>         at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473)
> {code}

