hadoop-hdfs-dev mailing list archives

From "Thanh Do (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-1384) NameNode should give client the first node in the pipeline from different rack other than that of excludedNodes list in the same rack.
Date Wed, 08 Sep 2010 02:01:34 GMT
NameNode should give client the first node in the pipeline from different rack  other than
that of excludedNodes list in the same rack.
---------------------------------------------------------------------------------------------------------------------------------------

                 Key: HDFS-1384
                 URL: https://issues.apache.org/jira/browse/HDFS-1384
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 0.20.1
            Reporter: Thanh Do


We saw a case where the NN keeps giving the client a first pipeline node from the same
rack, so the client hits an exception when it tries to set up the pipeline. The client
retries 5 times and then fails.
 
Here are more details. Suppose we have 2 racks:
- Rack 0: from dn1 to dn7
- Rack 1: from dn8 to dn14

The client asks for 3 datanodes and the NN replies with, for example, dn1, dn8 and dn9.
Because of a network partition, the client cannot reach any node in Rack 0.
The client therefore adds dn1 to its excludedNodes list and asks the NN again.
The NN then picks a different node in Rack 0 (one not yet in excludedNodes) as the
first node and returns it to the client, and so on. After 5 retries, the write fails.
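
The retry loop described above can be illustrated with a small toy model. This is
only a sketch under stated assumptions, not the NameNode's actual placement code:
all function names (naive_choose_first, rack_aware_choose_first, simulate) are
hypothetical, node selection is deterministic here whereas the real policy is
not, and only the choice of the first pipeline node is modeled. It compares a
chooser that ignores the racks of excluded nodes with one that prefers a rack
not represented in excludedNodes, as the issue title suggests.

```python
# Toy model of the scenario; every name here is hypothetical, not an HDFS API.
# Rack 0 holds dn1..dn7, Rack 1 holds dn8..dn14.
RACK_OF = {f"dn{i}": ("rack0" if i <= 7 else "rack1") for i in range(1, 15)}


def reachable(node):
    # Network partition: the client can reach only nodes in rack1.
    return RACK_OF[node] == "rack1"


def naive_choose_first(excluded):
    # Skips excluded nodes but ignores which rack they belong to.
    for node in RACK_OF:
        if node not in excluded:
            return node
    return None


def rack_aware_choose_first(excluded):
    # Prefers nodes on racks that have no excluded node; falls back to any
    # non-excluded node if every rack already has one.
    bad_racks = {RACK_OF[n] for n in excluded}
    candidates = [n for n in RACK_OF if n not in excluded]
    preferred = [n for n in candidates if RACK_OF[n] not in bad_racks]
    pool = preferred or candidates
    return pool[0] if pool else None


def simulate(choose_first, max_attempts=5):
    # The client excludes each unreachable first node and asks again,
    # giving up after max_attempts attempts.
    excluded = set()
    for _ in range(max_attempts):
        first = choose_first(excluded)
        if first is None:
            return None
        if reachable(first):
            return first
        excluded.add(first)
    return None
```

In this model, simulate(naive_choose_first) exhausts its attempts on Rack 0 nodes,
while simulate(rack_aware_choose_first) switches to Rack 1 once dn1 has been
excluded.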

This bug was found by our Failure Testing Service framework:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and 
Haryadi Gunawi (haryadi@eecs.berkeley.edu)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

