hadoop-hdfs-issues mailing list archives

From "Chen Liang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11507) NetworkTopology#chooseRandom may run into a dead loop due to race condition
Date Tue, 07 Mar 2017 01:15:33 GMT
Chen Liang created HDFS-11507:
---------------------------------

             Summary: NetworkTopology#chooseRandom may run into a dead loop due to race condition
                 Key: HDFS-11507
                 URL: https://issues.apache.org/jira/browse/HDFS-11507
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
            Reporter: Chen Liang
            Assignee: Chen Liang


{{NetworkTopology#chooseRandom()}} works as follows:
1. It counts the number of available nodes as {{availableNodes}}.
2. It checks how many nodes are excluded and deducts that count from {{availableNodes}}.
3. If {{availableNodes}} is still > 0, it assumes that there are nodes available.
4. It keeps looping until it finds such a node.
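
The four steps above can be sketched roughly as follows. This is a simplified, single-threaded model and not the actual Hadoop code: the class {{ChooseRandomSketch}}, the flat list of node names and the {{add}} helper are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical, simplified model of the described steps;
// NOT the real org.apache.hadoop.net.NetworkTopology.
class ChooseRandomSketch {
    private final List<String> nodes = new ArrayList<>();
    private final Random random = new Random();

    void add(String node) {
        nodes.add(node);
    }

    String chooseRandom(Set<String> excludedNodes) {
        // Step 1: count all nodes as availableNodes.
        int availableNodes = nodes.size();
        // Step 2: deduct the excluded nodes that are present.
        if (excludedNodes != null) {
            for (String n : excludedNodes) {
                if (nodes.contains(n)) {
                    availableNodes--;
                }
            }
        }
        // Step 3: give up only if the count says nothing is available.
        if (availableNodes <= 0) {
            return null;
        }
        // Step 4: keep picking at random until a non-excluded node turns up.
        // The count from step 3 is never refreshed inside this loop.
        while (true) {
            String ret = nodes.get(random.nextInt(nodes.size()));
            if (excludedNodes == null || !excludedNodes.contains(ret)) {
                return ret;
            }
        }
    }
}
```

As long as the node list does not change between step 1 and step 4, the loop terminates with probability 1, because at least one non-excluded node exists.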

Now suppose that, while step 3 or step 4 is running, the nodes that were actually available
are removed, so that all remaining nodes are excluded nodes. Although no node is actually
available any more, the code still proceeds as if {{availableNodes}} > 0. It then keeps
picking excluded nodes and loops forever, because the condition
{{if (excludedNodes == null || !excludedNodes.contains(ret))}}
will always be false.
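
A deterministic way to see the effect, without real threads, is to simulate the removal between the count and the loop. The sketch below is hypothetical ({{StaleCountDemo}} and {{attemptsWithStaleCount}} are made-up names) and caps the number of attempts only so that the demonstration terminates; per the description above, the real loop has no such exit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical demo of the stale-count problem; not Hadoop code.
class StaleCountDemo {
    // Returns -1 if a node was found, otherwise the number of failed attempts.
    static int attemptsWithStaleCount(int maxAttempts) {
        List<String> nodes = new ArrayList<>(Arrays.asList("a", "b", "c"));
        Set<String> excludedNodes = new HashSet<>(Arrays.asList("a", "b"));
        Random random = new Random(0);

        // Steps 1 to 3: the count says one node ("c") is available.
        int availableNodes = nodes.size();
        for (String n : excludedNodes) {
            if (nodes.contains(n)) {
                availableNodes--;
            }
        }
        if (availableNodes <= 0) {
            return 0;
        }

        // Simulated concurrent removal of the only available node.
        nodes.remove("c");

        // Step 4: every pick is now an excluded node, so the check never passes.
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            String ret = nodes.get(random.nextInt(nodes.size()));
            if (excludedNodes == null || !excludedNodes.contains(ret)) {
                return -1;
            }
        }
        return attempts;
    }
}
```

Whatever cap is passed in, every attempt is used up, which corresponds to the unbounded loop spinning forever.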

We may fix this by expanding the while loop to also include the {{availableNodes}} calculation,
so that {{availableNodes}} is re-calculated every time the loop fails to find an available node.
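
A minimal sketch of that idea, again on a simplified hypothetical model ({{RecountingChooseRandomSketch}} and its {{add}}/{{remove}} helpers are assumptions, not the eventual patch), moves the count inside the loop. It ignores locking entirely; how the count and the pick are synchronized in the real code is outside what this sketch shows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch of the proposed fix; not the actual Hadoop patch.
class RecountingChooseRandomSketch {
    private final List<String> nodes = new ArrayList<>();
    private final Random random = new Random();

    void add(String node) {
        nodes.add(node);
    }

    void remove(String node) {
        nodes.remove(node);
    }

    String chooseRandom(Set<String> excludedNodes) {
        while (true) {
            // Re-calculate availableNodes on every iteration, so a removal
            // that happened since the last pick is noticed.
            int availableNodes = nodes.size();
            if (excludedNodes != null) {
                for (String n : excludedNodes) {
                    if (nodes.contains(n)) {
                        availableNodes--;
                    }
                }
            }
            if (availableNodes <= 0) {
                return null; // nothing left to choose, so stop looping
            }
            String ret = nodes.get(random.nextInt(nodes.size()));
            if (excludedNodes == null || !excludedNodes.contains(ret)) {
                return ret;
            }
        }
    }
}
```

With this shape, once the last non-excluded node has been removed, the next iteration sees a count of zero and returns instead of spinning.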



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

