hadoop-hdfs-issues mailing list archives

From "Lisheng Sun (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14283) DFSInputStream to prefer cached replica
Date Tue, 10 Sep 2019 03:20:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926296#comment-16926296 ]

Lisheng Sun commented on HDFS-14283:

[~smeng] I am working on this Jira and will upload a patch later. Thank you.

> DFSInputStream to prefer cached replica
> ---------------------------------------
>                 Key: HDFS-14283
>                 URL: https://issues.apache.org/jira/browse/HDFS-14283
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>         Environment: HDFS Caching
>            Reporter: Wei-Chiu Chuang
>            Assignee: Lisheng Sun
>            Priority: Major
> HDFS Caching offers performance benefits. However, the NameNode currently does not give
> cached replicas higher priority, so HDFS caching is only useful when cache replication
> = 3, that is, when all replicas are cached in memory, so that a client cannot randomly
> pick an uncached replica.
> HDFS-6846 proposed letting the NameNode give higher priority to cached replicas. Changing
> logic in the NameNode is always tricky, so that didn't get much traction. Here I propose a
> different approach: let the client (DFSInputStream) prefer cached replicas.
> A {{LocatedBlock}} object already contains the cached replica locations, so a client has the
> information it needs. I think we can change {{DFSInputStream#getBestNodeDNAddrPair()}} for
> this purpose.
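
For illustration only, the client-side idea in the description might be sketched as
below. This is not the patch for this issue and does not use the real Hadoop classes:
datanodes are modelled as plain strings, and PreferCachedReplica, preferCached and
chooseNode are hypothetical names. The sketch assumes the client already holds, for a
block, the list of all replica locations, the list of cached replica locations, and a
set of nodes it currently considers dead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, self-contained sketch; not the actual DFSInputStream code.
public class PreferCachedReplica {

    // Reorders the replica locations so that cached ones come first.
    // The relative order inside each group is preserved.
    static List<String> preferCached(List<String> locations,
                                     List<String> cachedLocations) {
        Set<String> cached = new HashSet<>(cachedLocations);
        List<String> ordered = new ArrayList<>(locations.size());
        for (String dn : locations) {
            if (cached.contains(dn)) {
                ordered.add(dn);
            }
        }
        for (String dn : locations) {
            if (!cached.contains(dn)) {
                ordered.add(dn);
            }
        }
        return ordered;
    }

    // Picks the first usable node, trying cached replicas before uncached ones.
    // Returns null when every location is in deadNodes.
    static String chooseNode(List<String> locations,
                             List<String> cachedLocations,
                             Set<String> deadNodes) {
        for (String dn : preferCached(locations, cachedLocations)) {
            if (!deadNodes.contains(dn)) {
                return dn;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> locations = Arrays.asList("dn1", "dn2", "dn3");
        List<String> cached = Collections.singletonList("dn3");

        // Only dn3 holds a cached replica, so it is chosen first.
        System.out.println(chooseNode(locations, cached, new HashSet<>()));

        // If dn3 is considered dead, fall back to the first uncached replica.
        Set<String> dead = new HashSet<>(Collections.singletonList("dn3"));
        System.out.println(chooseNode(locations, cached, dead));
    }
}
```

With cache replication = 1, a selection along these lines would still send reads to
the single cached replica whenever its datanode is usable, which is the case the
description says is not handled today.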
