hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6268) Better sorting in NetworkTopology#pseudoSortByDistance when no local node is found
Date Thu, 01 May 2014 22:39:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987104#comment-13987104 ]

Andrew Wang commented on HDFS-6268:

Thanks for the reviews everyone. Answers to your comments inline:

bq. Do we need to put another random one at the second position? 

Clients read from the nodes in order, so our readers will all be using the random first node.
The second node only comes into play during a failure (or a hedged read I guess). Because
we sometimes set the seed and this algo is deterministic, we could get stuck selecting the
same random node for a block, but that's actually desirable. What we want to avoid is the
same ingest node getting chosen for every block, which the randomness should prevent.
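To illustrate the point about determinism, here is a minimal sketch, not the code in the patch. It assumes the seed is derived from something per-block (a hypothetical `blockId`); `SeededFirstReplica` and `pickIndex` are made-up names. With a fixed seed the same block always gets the same "random" replica, while different blocks can land on different replicas.

```java
import java.util.Random;

// Illustrative only: names and the per-block seed are assumptions,
// not taken from NetworkTopology or the attached patches.
class SeededFirstReplica {

  /** Returns an index in [0, numReplicas) that is stable for a given seed. */
  static int pickIndex(long seed, int numReplicas) {
    // A fresh Random built from the same seed always yields the same first value.
    return new Random(seed).nextInt(numReplicas);
  }

  public static void main(String[] args) {
    int numReplicas = 3;
    int[] hits = new int[numReplicas];
    // Hypothetical block IDs 0..2999, each used as its own seed.
    for (long blockId = 0; blockId < 3000; blockId++) {
      hits[pickIndex(blockId, numReplicas)]++;
    }
    for (int i = 0; i < numReplicas; i++) {
      System.out.println("replica " + i + " chosen first for " + hits[i] + " blocks");
    }
  }
}
```

Running `main` prints how many of the hypothetical blocks picked each replica first; repeated runs give identical counts because every choice is a pure function of its seed.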

bq. It seems to me that the second sort (Arrays.sort) would change the result of the first
(pseudoSortByDistance). Why is the sorting not consolidated into one that considers both distance
and decommission/stale factors?

I suppose we could, I'll consider it if I rev the patch.

bq. This would remove one level of indentation, and make it easier to read.

The bit at the end about choosing a random node is not within this if block, so I don't think
we can make this change.

ATM, I read this code and had the same thought. It would be cleaner and less corner-casey
if we first binned by network distance, then randomized each bin. I didn't make this change
since this is a hot code path and it'd be a bit slower, but since we're typically dealing
with 3 replicas, I can't imagine it making a big difference. We could also potentially fold
in the decom/stale state too, and get better locality for these edge cases. If you agree with
this assessment, I'll redo this patch as per above.
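A rough sketch of the "bin by network distance, then randomize each bin" idea, to make the proposal concrete. This is not the patch: `Replica` is a hypothetical stand-in for a datanode location, and the integer distances are assumed to be whatever the topology would report (smaller meaning closer).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.TreeMap;

// Illustrative only: Replica and sortByDistanceThenShuffle are made-up names.
class BinnedShuffle {

  /** Hypothetical stand-in for a replica location; not the HDFS DatanodeInfo class. */
  static final class Replica {
    final String name;
    final int distance; // assumed: smaller value means closer to the reader

    Replica(String name, int distance) {
      this.name = name;
      this.distance = distance;
    }
  }

  /** Groups replicas by distance, shuffles within each group, and concatenates nearest-first. */
  static List<Replica> sortByDistanceThenShuffle(List<Replica> replicas, Random rand) {
    // TreeMap iterates keys in ascending order, so the nearest bin comes out first.
    TreeMap<Integer, List<Replica>> bins = new TreeMap<>();
    for (Replica r : replicas) {
      bins.computeIfAbsent(r.distance, d -> new ArrayList<>()).add(r);
    }
    List<Replica> out = new ArrayList<>(replicas.size());
    for (List<Replica> bin : bins.values()) {
      Collections.shuffle(bin, rand); // randomize only among equally distant replicas
      out.addAll(bin);
    }
    return out;
  }
}
```

With three replicas the extra allocation is small, which is the reason given above for not expecting a big cost on the hot path. Decommissioning or stale state could be folded in by making the bin key a combination of state and distance, but that is left out of this sketch.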

> Better sorting in NetworkTopology#pseudoSortByDistance when no local node is found
> ----------------------------------------------------------------------------------
>                 Key: HDFS-6268
>                 URL: https://issues.apache.org/jira/browse/HDFS-6268
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.4.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>            Priority: Minor
>         Attachments: hdfs-6268-1.patch, hdfs-6268-2.patch
> In NetworkTopology#pseudoSortByDistance, if no local node is found, it will always place
> the first rack local node in the list in front.
> This became an issue when a dataset was loaded from a single datanode. This datanode
> ended up being the first replica for all the blocks in the dataset. When running an Impala
> query, the non-local reads when reading past a block boundary were all hitting this node,
> meaning massive load skew.

This message was sent by Atlassian JIRA
