hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5464) Simplify block report diff calculation
Date Tue, 15 Jul 2014 22:51:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062798#comment-14062798 ]

Andrew Wang commented on HDFS-5464:

Considering we'll have 8 or 10TB disks in a few years, we could be seeing a lot more than
just 500k blocks per DN. Storage-dense nodes with 24+ disks are also out there. Memory accesses
and conditional branches are also expensive. If we were just adding 500k integers together,
it wouldn't be a big deal, but this loop is doing more than that.
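
As a rough illustration of the scale argument above, the sketch below estimates how many full-size blocks a dense DataNode could hold. It assumes completely full disks, decimal terabytes, and a 128 MB block size (the Hadoop 2 default for dfs.blocksize); none of these figures come from the comment itself, and the class and method names are hypothetical. Under those assumptions a node with 24 disks of 10TB each holds on the order of 1.8 million blocks, and more if many blocks are smaller than the block size.

```java
class BlockCountEstimate {

    // Assumed block size: 128 MB, the Hadoop 2 default for dfs.blocksize.
    static final long ASSUMED_BLOCK_BYTES = 128L * 1024 * 1024;

    // Decimal terabyte, as disk vendors count it.
    static final long TB = 1_000_000_000_000L;

    // Number of full-size blocks that fit on a node with the given disks.
    // Ignores reserved space and partially filled blocks.
    static long estimateBlocks(int disks, long bytesPerDisk, long blockBytes) {
        return disks * bytesPerDisk / blockBytes;
    }

    public static void main(String[] args) {
        int[] diskCounts = {12, 24};
        long[] diskSizesTb = {8, 10};
        for (int disks : diskCounts) {
            for (long sizeTb : diskSizesTb) {
                long blocks = estimateBlocks(disks, sizeTb * TB, ASSUMED_BLOCK_BYTES);
                System.out.println(disks + " disks x " + sizeTb + "TB -> ~" + blocks + " blocks");
            }
        }
    }
}
```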

I'm not opposed in principle to this change since it is simpler and has the same time complexity,
but I'd like to see some microbenchmark results before committing it. Maybe rig something
up with JMH?
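
To make the "adding integers versus doing more per element" contrast concrete, here is a minimal timing sketch in plain Java. It is not a JMH benchmark and not the BlockManager.reportDiff(..) code: the countMissing loop is a hypothetical stand-in that does one hash lookup and one branch per element, and the hand-rolled timer only approximates what JMH does properly (warm-up, forking, dead-code protection). Treat any numbers it prints as indicative at best.

```java
import java.util.HashSet;
import java.util.Set;

class ReportDiffTimingSketch {

    // Baseline from the comment: just adding the integers together.
    static long sumInts(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Hypothetical stand-in for a diff loop: one hash lookup (memory access)
    // and one conditional branch per reported block ID.
    // This is NOT the actual BlockManager.reportDiff(..) logic.
    static int countMissing(long[] reported, Set<Long> stored) {
        int missing = 0;
        for (long id : reported) {
            if (!stored.contains(id)) {
                missing++;
            }
        }
        return missing;
    }

    // Crude timer: run warm-up rounds, then return mean nanoseconds per measured round.
    static long timeNanos(Runnable task, int warmupRounds, int measuredRounds) {
        for (int i = 0; i < warmupRounds; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRounds; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / measuredRounds;
    }

    public static void main(String[] args) {
        final int n = 500_000; // block count mentioned in the comment
        final int[] ints = new int[n];
        final long[] reported = new long[n];
        final Set<Long> stored = new HashSet<>();
        for (int i = 0; i < n; i++) {
            ints[i] = i;
            reported[i] = i;
            if (i % 2 == 0) {
                stored.add((long) i); // half of the IDs count as already known
            }
        }
        // Accumulate results into a sink so the JIT cannot discard the work.
        final long[] sink = new long[1];
        long sumNs = timeNanos(() -> sink[0] += sumInts(ints), 20, 50);
        long diffNs = timeNanos(() -> sink[0] += countMissing(reported, stored), 20, 50);
        System.out.println("sum of ints:    " + sumNs + " ns/round");
        System.out.println("diff-like loop: " + diffNs + " ns/round");
        System.out.println("(sink = " + sink[0] + ")");
    }
}
```

For results worth attaching to the issue, the same two loops would be better expressed as JMH benchmark methods, since JMH controls JIT warm-up and measurement noise far more rigorously than this sketch.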

> Simplify block report diff calculation
> --------------------------------------
>                 Key: HDFS-5464
>                 URL: https://issues.apache.org/jira/browse/HDFS-5464
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>            Priority: Minor
>         Attachments: h5464_20131105.patch, h5464_20131105b.patch, h5464_20131105c.patch, h5464_20140715.patch, h5464_20140715b.patch
>
> The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation.

This message was sent by Atlassian JIRA
