hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1300) deletion of excess replicas does not take into account 'rack-locality'
Date Thu, 07 Jun 2007 17:09:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12502431

dhruba borthakur commented on HADOOP-1300:

+1 for the code. Looks good.

I still think that the algorithm is CPU-heavy when there are many replicas to
delete, but I do not have an alternative lighter-weight algorithm. HDFS allows an application
to set the replication factor of a file, so even if dfs.replication.max is set to something reasonable
(e.g. 40), excess-replica deletion could still need to get rid of 30-odd replicas.
In that case it could consume considerable CPU. This was the reason I suggested that we
move this method outside the FSNamesystem lock.
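To illustrate the idea being discussed (this is a hedged sketch, not the code in excessDel.patch): a rack-aware deletion policy should prefer to remove excess replicas from racks that hold more than one copy, so that every rack keeps at least one replica where possible. The class and method names below, and the representation of replicas as a datanode-to-rack map, are illustrative assumptions.

```java
import java.util.*;

// Illustrative sketch only -- not the actual HDFS implementation.
// Chooses which excess replicas to delete while preserving rack diversity.
public class RackAwareDeletion {

    // replicas: hypothetical map from datanode name to rack name.
    // target: desired replication factor.
    // Returns the datanodes whose replicas should be deleted.
    static List<String> chooseExcess(Map<String, String> replicas, int target) {
        List<String> toDelete = new ArrayList<>();
        int excess = replicas.size() - target;
        if (excess <= 0) return toDelete;

        // Count replicas per rack.
        Map<String, Integer> perRack = new HashMap<>();
        for (String rack : replicas.values())
            perRack.merge(rack, 1, Integer::sum);

        // First pass: delete only from racks holding more than one replica,
        // so each rack keeps at least one copy.
        for (Map.Entry<String, String> e : replicas.entrySet()) {
            if (toDelete.size() >= excess) break;
            String rack = e.getValue();
            if (perRack.get(rack) > 1) {
                toDelete.add(e.getKey());
                perRack.merge(rack, -1, Integer::sum);
            }
        }

        // Second pass: if still over-replicated, delete remaining replicas
        // arbitrarily (rack diversity can no longer be preserved).
        for (String node : replicas.keySet()) {
            if (toDelete.size() >= excess) break;
            if (!toDelete.contains(node)) toDelete.add(node);
        }
        return toDelete;
    }
}
```

Note that both passes scan all replicas for each block being processed, which hints at the CPU cost raised above when many replicas must be removed at once.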

> deletion of excess replicas does not take into account 'rack-locality'
> ----------------------------------------------------------------------
>                 Key: HADOOP-1300
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1300
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Koji Noguchi
>            Assignee: Hairong Kuang
>         Attachments: excessDel.patch
> One rack went down today, resulting in one missing block/file.
> Looking at the log, this block was originally over-replicated. 
> 3 replicas on one rack and 1 replica on another.
> Namenode decided to delete the latter, leaving 3 replicas on the same rack.
> It'll be nice if the deletion is also rack-aware.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
