hadoop-hdfs-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6846) NetworkTopology#sortByDistance should give nodes higher priority, which cache the block.
Date Mon, 18 Aug 2014 20:53:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101269#comment-14101269 ]

Jason Lowe commented on HDFS-6846:
----------------------------------

That seems like a reasonable compromise if we don't need to worry about overwhelming a rack-local
node.

I got the impression that the original problem behind HDFS-6268 was that the same rack-local
node always appeared first for all blocks of a file, which caused load issues on that node.
If a rack-local node cached all the blocks of a file, then it seems like we'd be in the same
place as that JIRA. But maybe I'm misunderstanding HDFS-6268, or for some reason we don't
need to worry about a single node getting all the blocks of a multi-block file cached.

> NetworkTopology#sortByDistance should give nodes higher priority, which cache the block.
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-6846
>                 URL: https://issues.apache.org/jira/browse/HDFS-6846
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>
> Currently there are 3 weights:
> * local
> * same rack
> * off rack
> But if some nodes cache the block, then it is faster for the client to read the block from
> those nodes. So we should add some more weights, as follows:
> * local
> * cached & same rack
> * same rack
> * cached & off rack
> * off rack
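
The five proposed weights can be read as a sort key. The following is a minimal sketch of
that ordering only, not the actual NetworkTopology#sortByDistance code: the class
CacheAwareSortSketch, the Node type, and its fields are hypothetical names made up for
illustration. It assumes each replica location is already known to be local, same rack, or
off rack, and whether it caches the block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: hypothetical types, not the Hadoop NetworkTopology API.
public class CacheAwareSortSketch {

    /** Hypothetical view of one replica location relative to the reader. */
    static final class Node {
        final String name;
        final boolean local;    // same host as the reader
        final boolean sameRack; // same rack as the reader
        final boolean cached;   // node holds the block in its cache

        Node(String name, boolean local, boolean sameRack, boolean cached) {
            this.name = name;
            this.local = local;
            this.sameRack = sameRack;
            this.cached = cached;
        }
    }

    /** Lower weight means earlier in the list, following the proposed five tiers. */
    static int weight(Node n) {
        if (n.local) {
            return 0;                 // local
        }
        if (n.sameRack) {
            return n.cached ? 1 : 2;  // cached & same rack, then same rack
        }
        return n.cached ? 3 : 4;      // cached & off rack, then off rack
    }

    /** Returns node names ordered by weight; ties keep their input order. */
    static List<String> sortByWeight(List<Node> nodes) {
        List<Node> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.comparingInt(CacheAwareSortSketch::weight));
        List<String> names = new ArrayList<>();
        for (Node n : sorted) {
            names.add(n.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("off-rack", false, false, false),
            new Node("rack-plain", false, true, false),
            new Node("off-rack-cached", false, false, true),
            new Node("rack-cached", false, true, true),
            new Node("local", true, false, false));
        System.out.println(sortByWeight(nodes));
    }
}
```

In this sketch, ties within a tier simply keep their input order. It also shows the concern
raised in the comment above: a rack-local node that caches every block of a file would get
weight 1 for every block and so would always come first for a reader with no local replica.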



--
This message was sent by Atlassian JIRA
(v6.2#6252)
