hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5124) A few optimizations to FsNamesystem#RecentInvalidateSets
Date Mon, 02 Feb 2009 23:33:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669800#action_12669800 ]

Konstantin Shvachko commented on HADOOP-5124:
---------------------------------------------

# {{computeInvalidateWork()}}
## You probably want to use {{Math.min()}} in computing the value of {{nodesToProcess}}
## I would rather go with
{{ArrayList<String> keyArray = new ArrayList<String>(recentInvalidateSets.keySet());}}
than {{String[] keyArray}}. You will then be able to use {{Collections.swap()}} instead of implementing it yourself.
Ideally, of course, it would be better to just get a random element from the TreeMap and put it into the array list.
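To make the suggestion concrete, here is a minimal sketch of what the randomized key selection in {{computeInvalidateWork()}} could look like with {{Math.min()}} and {{Collections.swap()}}; the cap name {{maxNodesPerIteration}} is hypothetical, and the snippet assumes the usual {{java.util}} classes ({{ArrayList}}, {{Collections}}, {{Random}}) are in scope:
{code}
// Hedged sketch, not the actual patch: cap the number of nodes handled per
// pass and pick that many keys at random via a partial Fisher-Yates shuffle.
int nodesToProcess = Math.min(maxNodesPerIteration, recentInvalidateSets.size());
ArrayList<String> keyArray = new ArrayList<String>(recentInvalidateSets.keySet());
Random r = new Random();
for (int i = 0; i < nodesToProcess; i++) {
  // choose a random key from the not-yet-selected tail and swap it into slot i
  int keyIndex = i + r.nextInt(keyArray.size() - i);
  Collections.swap(keyArray, i, keyIndex);
}
// keyArray.subList(0, nodesToProcess) is now a uniform random sample of the keys.
{code}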
# {{invalidateWorkForOneNode()}}
{code}
    if(it.hasNext())
      recentInvalidateSets.put(firstNodeId, invalidateSet);
{code}
This is a no-op in your case, because {{recentInvalidateSets}} already maps {{firstNodeId}} to exactly this {{invalidateSet}}, which was modified in place earlier in the loop.
The original variant of this code
{code}
    if(!it.hasNext())
      recentInvalidateSets.remove(nodeId);
{code}
makes more sense, since it removes the node's entry entirely once the node has no invalid blocks left.
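For context, a rough sketch of the surrounding loop in {{invalidateWorkForOneNode()}} as I read it; the names {{blockInvalidateLimit}} and {{blocksToInvalidate}} are assumptions for illustration, not quotes from the patch:
{code}
Collection<Block> invalidateSet = recentInvalidateSets.get(firstNodeId);
ArrayList<Block> blocksToInvalidate = new ArrayList<Block>();
Iterator<Block> it = invalidateSet.iterator();
for (int count = 0; count < blockInvalidateLimit && it.hasNext(); count++) {
  blocksToInvalidate.add(it.next());
  it.remove();   // mutates the very set object that the map already holds
}
// Because invalidateSet is the same object stored in the map, putting it back
// changes nothing; removing the exhausted entry is the branch that matters.
if (!it.hasNext())
  recentInvalidateSets.remove(firstNodeId);
{code}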
# Could you please run some tests showing how much of an optimization we can get from randomizing the data-node selection?

> A few optimizations to FsNamesystem#RecentInvalidateSets
> --------------------------------------------------------
>
>                 Key: HADOOP-5124
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5124
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: optimizeInvalidate.patch, optimizeInvalidate1.patch
>
>
> This jira proposes a few optimizations to FsNamesystem#RecentInvalidateSets:
> 1. When removing all replicas of a block, it does not traverse all nodes in the map.
> Instead it traverses only the nodes on which the block is located.
> 2. When dispatching blocks to datanodes in ReplicationMonitor, it randomly chooses a
> predefined number of datanodes and dispatches blocks to those datanodes. This strategy provides
> fairness to all datanodes, whereas the current strategy always starts from the first datanode.
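For illustration, one way optimization 1 could look: remove a block's pending invalidations by visiting only the datanodes that hold the block, rather than every key in the map. This is a hedged sketch; the {{blocksMap.nodeIterator()}} and {{getStorageID()}} calls are assumptions about the surrounding FSNamesystem code, not quotes from the patch.
{code}
void removeFromInvalidates(Block b) {
  // Iterate only over the datanodes on which the block is located.
  for (Iterator<DatanodeDescriptor> it = blocksMap.nodeIterator(b); it.hasNext();) {
    String storageID = it.next().getStorageID();
    Collection<Block> invalidateSet = recentInvalidateSets.get(storageID);
    // Drop the block from that node's invalidate set; drop the whole entry
    // once the set becomes empty.
    if (invalidateSet != null && invalidateSet.remove(b) && invalidateSet.isEmpty()) {
      recentInvalidateSets.remove(storageID);
    }
  }
}
{code}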

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

