hadoop-hdfs-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HDDS-2199) In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
Date Wed, 02 Oct 2019 10:21:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-2199?focusedWorklogId=321798&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321798 ]

ASF GitHub Bot logged work on HDDS-2199:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Oct/19 10:20
            Start Date: 02/Oct/19 10:20
    Worklog Time Spent: 10m 
      Work Description: elek commented on pull request #1551: HDDS-2199 In SCMNodeManager
dnsToUuidMap cannot track multiple DNs on the same host
URL: https://github.com/apache/hadoop/pull/1551#discussion_r330472747
 
 

 ##########
 File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMBlockProtocolServer.java
 ##########
 @@ -295,7 +297,33 @@ public ScmInfo getScmInfo() throws IOException {
     boolean auditSuccess = true;
     try{
       NodeManager nodeManager = scm.getScmNodeManager();
-      Node client = nodeManager.getNodeByAddress(clientMachine);
 
 Review comment:
   I am trying to understand why this big block cannot be as simple as:
   
   ```
         // Pick the first datanode registered on the client's host, if any.
         Node client = null;
         List<DatanodeDetails> possibleClients =
             nodeManager.getNodesByAddress(clientMachine);
         if (!possibleClients.isEmpty()) {
           client = possibleClients.get(0);
         }
   ```
   
   It seems to be logic for finding a datanode which is on the same host as the client. I am
not sure we need this tricky randomization rather than simply choosing the first possible
datanode: if client is null, we don't need to sort (that case is handled by the sort method
below), and if there are multiple datanodes on the same host as the client we can choose the
first one, because in the topology sort it doesn't matter which one is chosen.
   
   But please correct me if I am wrong.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 321798)
    Time Spent: 2h  (was: 1h 50m)

> In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
> -------------------------------------------------------------------------
>
>                 Key: HDDS-2199
>                 URL: https://issues.apache.org/jira/browse/HDDS-2199
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>    Affects Versions: 0.5.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Often in test clusters and tests, we start multiple datanodes on the same host.
> In SCMNodeManager.register() there is a map of hostname -> datanode UUID called dnsToUuidMap.
> If several DNs register from the same host, the entry in the map will be overwritten
and the last DN to register will 'win'.
> This means that the method getNodeByAddress() does not return the correct DatanodeDetails
object when multiple datanodes are registered from the same address.
> This method is currently only used in SCMBlockProtocolServer.sortDatanodes() to allow it to see
if one of the nodes matches the client, but it will also need to be used by the Decommission code.
> Perhaps we could change the getNodeByAddress() method to return a list of DNs? In normal
production clusters, there should only be one returned, but in test clusters, there may be
many. Any code looking for a specific DN entry would need to iterate the list and match on
the port number too, as host:port would be the unique definition of a datanode.
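
The proposal above could be sketched roughly as follows. This is only an illustration, not the code from the pull request: the class HostToDatanodesSketch, the stand-in type Dn and the helper findByHostAndPort() are hypothetical names, and the real SCMNodeManager and DatanodeDetails classes carry much more state. The idea is that each hostname maps to a set of UUIDs instead of a single UUID, so a later registration from the same host no longer overwrites an earlier one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class HostToDatanodesSketch {

  /** Hypothetical stand-in for DatanodeDetails: only the fields used here. */
  public static final class Dn {
    public final UUID uuid;
    public final String host;
    public final int port;

    public Dn(UUID uuid, String host, int port) {
      this.uuid = uuid;
      this.host = host;
      this.port = port;
    }
  }

  // hostname -> UUIDs of every datanode registered from that host
  private final Map<String, Set<UUID>> dnsToUuidMap = new ConcurrentHashMap<>();
  // UUID -> datanode, so the UUIDs above can be resolved to full entries
  private final Map<UUID, Dn> nodes = new ConcurrentHashMap<>();

  /** Adds the datanode to its host's set instead of replacing the entry. */
  public void register(Dn dn) {
    nodes.put(dn.uuid, dn);
    dnsToUuidMap
        .computeIfAbsent(dn.host, h -> ConcurrentHashMap.newKeySet())
        .add(dn.uuid);
  }

  /** Returns every datanode on the host; an empty list if there is none. */
  public List<Dn> getNodesByAddress(String host) {
    List<Dn> result = new ArrayList<>();
    for (UUID id : dnsToUuidMap.getOrDefault(host, Collections.emptySet())) {
      Dn dn = nodes.get(id);
      if (dn != null) {
        result.add(dn);
      }
    }
    return result;
  }

  /** Narrows the host's datanodes down to one by matching on the port too. */
  public Optional<Dn> findByHostAndPort(String host, int port) {
    return getNodesByAddress(host).stream()
        .filter(dn -> dn.port == port)
        .findFirst();
  }
}
```

With this shape, a caller that only needs "some datanode on the client's host" can take any element of the returned list, while a caller that needs one specific datanode (such as the decommission code mentioned above) would match on host and port.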



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

