hadoop-common-dev mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1155) Additional performance improvement to chooseTarget
Date Tue, 10 Apr 2007 23:39:32 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-1155:

    Attachment: rackMap2.patch

This patch makes 4 changes:

1. Add a map from rack name to rack node in NetworkTopology to speed up getNode.
2. Optimize sortedByDistance by taking Sameer's suggestion in HADOOP-1073:
>   Do we need to sort datanodes by distance? Why not just do a linear scan for the
>   on node and on rack instances, put them at the front of the pipeline and leave the
>   rest in random
    This suggestion allows us to reduce memory allocation and the number of calls to getDistance.
3. Add a test case for sortedByDistance.
4. Change chooseRandom to return a list instead of an array. This saves one memory
allocation.
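
The linear scan described in change 2 can be sketched roughly as follows. This is an illustration only, not the code in rackMap2.patch: SimpleNode, PseudoSort and pseudoSortByDistance are hypothetical names, and the sketch assumes a node is identified by a host name and a rack name. It makes one pass to find a node on the reader's host and a node on the reader's rack, swaps them to the front in place, and leaves the remaining nodes in whatever order they arrived in, so it allocates nothing and never computes a distance.

```java
/** Hypothetical stand-in for a datanode: only a host name and a rack name. */
class SimpleNode {
    final String host;
    final String rack;

    SimpleNode(String host, String rack) {
        this.host = host;
        this.rack = rack;
    }
}

/** Hypothetical helper; not the NetworkTopology class from the patch. */
class PseudoSort {
    /**
     * Moves a node on the reader's host (if any) to index 0 and a node on the
     * reader's rack (if any) to the next slot. Other nodes keep an arbitrary order.
     */
    static void pseudoSortByDistance(SimpleNode reader, SimpleNode[] nodes) {
        int localNode = -1; // index of a node on the same host as the reader
        int localRack = -1; // index of a different node on the same rack
        for (int i = 0; i < nodes.length; i++) {
            if (localNode == -1 && nodes[i].host.equals(reader.host)) {
                localNode = i;
            } else if (localRack == -1 && nodes[i].rack.equals(reader.rack)) {
                localRack = i;
            }
            if (localNode != -1 && localRack != -1) {
                break; // both found, no need to scan further
            }
        }
        int front = 0;
        if (localNode != -1) {
            swap(nodes, front, localNode);
            // The swap may have moved the remembered rack-local node.
            if (localRack == front) {
                localRack = localNode;
            }
            front++;
        }
        if (localRack != -1) {
            swap(nodes, front, localRack);
        }
    }

    private static void swap(SimpleNode[] nodes, int i, int j) {
        SimpleNode tmp = nodes[i];
        nodes[i] = nodes[j];
        nodes[j] = tmp;
    }
}
```

Compared with a full sort, this touches each node at most once and needs only string comparisons, which is the saving in allocations and getDistance calls that the suggestion is after.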

> Additional performance improvement to chooseTarget
> --------------------------------------------------
>                 Key: HADOOP-1155
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1155
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.12.2
>            Reporter: Hairong Kuang
>         Assigned To: Hairong Kuang
>             Fix For: 0.13.0
>         Attachments: rackMap.patch, rackMap1.patch, rackMap2.patch
> A few additional thoughts to improve the performance of chooseTarget:
> 1. Reduce the # of calls to getDistance in sortedByDistance
> 2. Improve the performance of getNode by adding a rack name to rack node map
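
For illustration, the rack name to rack node map from item 2 might look something like the sketch below. RackNode and RackIndex are hypothetical names and are not taken from the attached patches; the only point shown is that a hash map keyed by rack name lets a lookup go straight to the rack node instead of searching the topology for it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical rack node: a rack name plus the hosts registered under it. */
class RackNode {
    final String name;
    final List<String> hosts = new ArrayList<>();

    RackNode(String name) {
        this.name = name;
    }
}

/** Hypothetical index; the real NetworkTopology holds more state than this. */
class RackIndex {
    // Rack name -> rack node, so a rack lookup is a single hash lookup.
    private final Map<String, RackNode> rackMap = new HashMap<>();

    /** Registers a host under its rack, creating the rack node on first use. */
    void add(String rackName, String host) {
        rackMap.computeIfAbsent(rackName, RackNode::new).hosts.add(host);
    }

    /** Returns the rack node for the given name, or null if the rack is unknown. */
    RackNode getRack(String rackName) {
        return rackMap.get(rackName);
    }
}
```

The cost is one extra map that has to be kept in sync whenever nodes are added or removed.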

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
