hadoop-hdfs-issues mailing list archives

From "Liang Xie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5917) Have an ability to refresh deadNodes list periodically
Date Wed, 12 Feb 2014 01:50:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898645#comment-13898645 ]

Liang Xie commented on HDFS-5917:

[~decster], thanks for your comments! Yes, I understand your concern. My understanding is:
1) We need deadNodesRefreshIntervalMs because we don't know the size of deadNodes. We cannot
always assume it has only one or two entries, right? An end user can set the replication
factor to a value higher than the default of 3. In any case, the deadNodesRefreshIntervalMs
parameter is just a shortcut optimization.
2) "if the node is still down, you may wait a long time before we can try another live node,
when happens, this increases io latency a lot": in the current trunk code, we already have
configurable parameters to control the latency caused by retries, right? :)
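
To make point 1 concrete, here is a minimal sketch of a time-based refresh that clears the list regardless of how many entries it holds. It is only an illustration, not the attached patch: the class RefreshingDeadNodes, its methods, and the explicit nowMs argument are hypothetical names chosen for this example, and only deadNodesRefreshIntervalMs comes from the discussion above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the client's deadNodes bookkeeping.
// This is NOT the real HDFS client code, only a sketch of the idea.
class RefreshingDeadNodes {
    private final Set<String> deadNodes = new HashSet<>();
    private final long deadNodesRefreshIntervalMs;
    private long lastRefreshMs;

    RefreshingDeadNodes(long deadNodesRefreshIntervalMs, long nowMs) {
        this.deadNodesRefreshIntervalMs = deadNodesRefreshIntervalMs;
        this.lastRefreshMs = nowMs;
    }

    // Record a node that failed during a read.
    synchronized void addToDeadNodes(String node) {
        deadNodes.add(node);
    }

    // Before answering, drop every entry if the refresh interval has elapsed.
    // The decision depends only on elapsed time, not on the size of the set.
    synchronized boolean isDead(String node, long nowMs) {
        if (nowMs - lastRefreshMs >= deadNodesRefreshIntervalMs) {
            deadNodes.clear();
            lastRefreshMs = nowMs;
        }
        return deadNodes.contains(node);
    }
}
```

The clock is passed in explicitly so the behaviour is easy to check; a real client would probably read the time itself. After the interval passes, a node that is still down would simply fail once more and be added back, which is where the existing retry settings mentioned in point 2 come in.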

> Have an ability to refresh deadNodes list periodically
> ------------------------------------------------------
>                 Key: HDFS-5917
>                 URL: https://issues.apache.org/jira/browse/HDFS-5917
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-5917.txt
> In the current HBase + HDFS trunk implementation, once a node is added to the deadNodes map,
> it cannot be chosen again until deadNodes.clear() is invoked. While fixing HDFS-5637, I had
> a rough idea: since quite a few conditions can cause a node to be added to the deadNodes map,
> it would be better to have the ability to refresh this cached map automatically. This is good
> for the HBase scenario at least. For example, before HDFS-5637 was fixed, if the local node
> was added to deadNodes, reads would go to a remote node even if the local node was actually
> alive :) Worse, if the block belongs to a huge HFile that is not picked up by any minor
> compaction in the short term, the performance penalty continues until a large compaction
> happens, the region is reopened, or deadNodes.clear() is invoked...
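
The behaviour described above can be illustrated with a small sketch. It assumes, for illustration only, that the reader walks the replica list in order (local replica first) and skips anything found in deadNodes; StickyDeadNodesDemo and chooseNode are hypothetical names, not the actual HDFS client code.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration of why a stale deadNodes entry is sticky.
class StickyDeadNodesDemo {
    // Return the first replica that is not in deadNodes,
    // or null if every replica is currently excluded.
    static String chooseNode(List<String> replicas, Set<String> deadNodes) {
        for (String node : replicas) {
            if (!deadNodes.contains(node)) {
                return node;
            }
        }
        return null;
    }
}
```

With replicas ordered as local, remote1, remote2 and the local node present in deadNodes, chooseNode keeps returning remote1 no matter whether the local node has recovered. Only clearing the set lets the local replica be chosen again.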
