hadoop-hdfs-dev mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-1384) NameNode should give client the first node in the pipeline from different rack other than that of excludedNodes list in the same rack.
Date Wed, 30 Jul 2014 23:51:39 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HDFS-1384.
------------------------------------

    Resolution: Incomplete

Closing as stale.

> NameNode should give client the first node in the pipeline from different rack other than that of excludedNodes list in the same rack.
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1384
>                 URL: https://issues.apache.org/jira/browse/HDFS-1384
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.20-append
>            Reporter: Thanh Do
>
> We saw a case where the NN keeps giving the client nodes from the same rack, which causes
> an exception on the client when it tries to set up the pipeline. The client retries 5 times
> and then fails.
>
> Here are more details. Suppose we have 2 racks:
> - Rack 0: dn1 to dn7
> - Rack 1: dn8 to dn14
> The client asks for 3 DNs and the NN replies with dn1, dn8 and dn9, for example.
> Because of a network partition, the client cannot reach any node in Rack 0.
> Hence, the client adds dn1 to the excludedNodes list and asks the NN again.
> Interestingly, the NN picks a different node in Rack 0 (one not yet in excludedNodes)
> and gives it back to the client, and so on. The client keeps retrying, and after 5 retries
> the write fails.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and 
> Haryadi Gunawi (haryadi@eecs.berkeley.edu)
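
The retry loop described in the report can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the NameNode's actual placement code: `choose_first_node` and `write_block` are hypothetical helpers, the client is assumed to sit in Rack 0 so that rack is preferred for the first pipeline node, node choice is made deterministic for readability, and "5 retries" is modeled as 5 attempts in total. The `rack_aware` flag models the behavior the issue title asks for, where racks that already hold an excluded node are avoided.

```python
# Toy cluster from the report: dn1..dn7 in rack0, dn8..dn14 in rack1.
RACKS = {f"dn{i}": ("rack0" if i <= 7 else "rack1") for i in range(1, 15)}


def choose_first_node(excluded, rack_aware, preferred_rack="rack0"):
    """Hypothetical stand-in for the NN picking the first pipeline node."""
    # Requested behavior: skip every rack that already has an excluded node.
    avoid = {RACKS[n] for n in excluded} if rack_aware else set()
    candidates = [n for n in RACKS
                  if n not in excluded and RACKS[n] not in avoid]
    if not candidates:
        # Fall back to any node that is not itself excluded.
        candidates = [n for n in RACKS if n not in excluded]
    if not candidates:
        return None
    # Stable sort: nodes in the preferred rack come first (assumption).
    candidates.sort(key=lambda n: RACKS[n] != preferred_rack)
    return candidates[0]


def write_block(rack_aware, unreachable_rack="rack0", max_attempts=5):
    """Simulate the client's pipeline setup under a network partition."""
    excluded = []
    for _ in range(max_attempts):
        first = choose_first_node(excluded, rack_aware)
        if first is None:
            break
        if RACKS[first] != unreachable_rack:
            return True, excluded   # pipeline setup succeeds
        excluded.append(first)      # client excludes the node and asks again
    return False, excluded


if __name__ == "__main__":
    print("node-only exclusion:", write_block(rack_aware=False))
    print("rack-aware exclusion:", write_block(rack_aware=True))
```

With node-only exclusion the simulated client burns all five attempts on dn1 through dn5 and the write fails; with rack-aware exclusion the second request already returns a Rack 1 node.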



--
This message was sent by Atlassian JIRA
(v6.2#6252)
