hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient
Date Tue, 25 Nov 2014 00:17:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223815#comment-14223815
] 

Suresh Srinivas commented on HDFS-7435:
---------------------------------------

bq. While I agree it's a questionably nice to have feature
I am not sure why you think it is only a questionable nice-to-have feature...

bq. If 20-30MB is going to cause a promotion failure in a namenode servicing a hundred millions
of blocks - it's already game over. 2.x easily generates over 1GB garbage/sec at a mere ~20k
ops/sec.
Just so that we are on the same page: Java arrays require contiguous memory. In
many installs, when the namenode becomes unresponsive and datanodes end up resending block
reports, those block reports get promoted to the old generation (because the namenode is
processing them slowly). Since the old generation may be fragmented, promotion can fail when
large arrays need to be promoted.
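To make the concern concrete, here is a minimal sketch (my illustration, not the HDFS implementation) of one way to hold a large block report without any single multi-megabyte contiguous array: store the longs in fixed-size chunks, so the GC only ever has to promote small, independently placeable arrays into a possibly fragmented old generation.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a chunked long store. Chunk size and class name are
// illustrative assumptions, not HDFS code.
public class ChunkedLongs {
    static final int CHUNK = 64 * 1024;      // longs per chunk (512 KB)

    private final List<long[]> chunks = new ArrayList<>();
    private int size;

    void add(long v) {
        int chunkIdx = size / CHUNK;
        if (chunkIdx == chunks.size()) {
            chunks.add(new long[CHUNK]);     // small array; no huge contiguous promotion
        }
        chunks.get(chunkIdx)[size % CHUNK] = v;
        size++;
    }

    long get(int i) {
        return chunks.get(i / CHUNK)[i % CHUNK];
    }

    int size() { return size; }

    public static void main(String[] args) {
        ChunkedLongs c = new ChunkedLongs();
        for (long v = 0; v < 200_000; v++) {
            c.add(v);                        // ~200k longs, never one big array
        }
        System.out.println(c.size());        // prints 200000
        System.out.println(c.get(150_000));  // prints 150000
    }
}
```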

That said, I am fine doing it in a subsequent jira. It will end up touching the same parts
of, or replacing, the code that you are adding.


> PB encoding of block reports is very inefficient
> ------------------------------------------------
>
>                 Key: HDFS-7435
>                 URL: https://issues.apache.org/jira/browse/HDFS-7435
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7435.patch
>
>
> Block reports are encoded as a PB repeating long.  Repeating fields use an {{ArrayList}}
with a default capacity of 10.  A block report containing tens or hundreds of thousands of longs
(3 for each replica) is extremely expensive since the {{ArrayList}} must realloc many times.
 Also, decoding repeating fields will box the primitive longs, which must then be unboxed.
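The cost described above can be sketched outside of protobuf. The snippet below (an illustration, not the actual generated PB decode path) contrasts accumulating a report into an `ArrayList<Long>` starting at the default capacity of 10 — boxing every value and reallocating the backing array as it grows — with a single pre-sized primitive `long[]`; the replica count is an arbitrary example figure.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the decode cost, not HDFS or protobuf-generated code.
public class BlockReportDecodeSketch {
    static final int REPLICAS = 100_000;     // illustrative report size

    // Roughly what a repeated PB long field does: an ArrayList<Long>
    // with default capacity 10, boxing each value and growing the
    // backing array repeatedly as elements are appended.
    static List<Long> decodeBoxed(long[] wire) {
        List<Long> out = new ArrayList<>();  // capacity 10
        for (long v : wire) {
            out.add(v);                      // boxes to Long; may realloc
        }
        return out;
    }

    // The primitive alternative: one pre-sized long[], no boxing,
    // no incremental growth.
    static long[] decodePrimitive(long[] wire) {
        long[] out = new long[wire.length];
        System.arraycopy(wire, 0, out, 0, wire.length);
        return out;
    }

    public static void main(String[] args) {
        long[] wire = new long[REPLICAS * 3];   // 3 longs per replica
        for (int i = 0; i < wire.length; i++) wire[i] = i;

        List<Long> boxed = decodeBoxed(wire);
        long[] prim = decodePrimitive(wire);

        // Same contents either way; the boxed path has allocated ~300k
        // Long objects plus a series of ever-larger backing arrays.
        System.out.println(boxed.size() == prim.length);
        System.out.println(boxed.get(12_345) == prim[12_345]);
    }
}
```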



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
