hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-2060) DFSClient should choose a block that is local to the node where the client is running
Date Fri, 02 Nov 2007 13:43:51 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi resolved HADOOP-2060.
--------------------------------

    Resolution: Invalid


The suggested feature is already in.
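For context on the resolution: the namenode returns a block's replica locations already sorted by network distance to the reader (the javadoc quoted below notes the entries are "already in the priority order"), so a client that takes the first live entry gets a local replica whenever one exists. The sketch below illustrates that ordering; NodeInfo and distanceTo() are invented simplifications for this example, not Hadoop's actual NetworkTopology API.

{code}
// Hypothetical sketch of distance-based replica ordering. NodeInfo and
// distanceTo() are invented for illustration; they are NOT Hadoop classes.
// Hadoop performs an analogous sort on the namenode side before returning
// the DatanodeInfo[] to the client.
import java.util.Arrays;
import java.util.Comparator;

public class ReplicaOrderingSketch {

  static class NodeInfo {
    final String host, rack;
    NodeInfo(String host, String rack) { this.host = host; this.rack = rack; }
  }

  // Simplified topology distance: 0 = same host, 1 = same rack, 2 = off-rack.
  static int distanceTo(NodeInfo reader, NodeInfo replica) {
    if (reader.host.equals(replica.host)) return 0;
    if (reader.rack.equals(replica.rack)) return 1;
    return 2;
  }

  public static void main(String[] args) {
    NodeInfo reader = new NodeInfo("node3", "rackA");
    NodeInfo[] replicas = {
      new NodeInfo("node7", "rackB"),
      new NodeInfo("node3", "rackA"),   // replica local to the reader
      new NodeInfo("node5", "rackA"),
    };
    // Sort closest-first; after this, "pick the first live node"
    // (as bestNode() in the excerpt below does) is already locality-aware.
    Arrays.sort(replicas, Comparator.comparingInt(r -> distanceTo(reader, r)));
    System.out.println(replicas[0].host);   // prints "node3"
  }
}
{code}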


> DFSClient should choose a block that is local to the node where the client is running
> -------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2060
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2060
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Runping Qi
>
> When I chased down the DFSClient code to see how data locality impacts DFS read throughput,
> I realized that DFSClient does not use data locality info (at least not obviously)
> when it chooses a replica of a block to read from.
> Here is the relevant code:
> {code}
>  /**
>    * Pick the best node from which to stream the data.
>    * Entries in <i>nodes</i> are already in the priority order
>    */
>   private DatanodeInfo bestNode(DatanodeInfo nodes[], 
>                                 AbstractMap<DatanodeInfo, DatanodeInfo> deadNodes)
>                                 throws IOException {
>     if (nodes != null) { 
>       for (int i = 0; i < nodes.length; i++) {
>         if (!deadNodes.containsKey(nodes[i])) {
>           return nodes[i];
>         }
>       }
>     }
>     throw new IOException("No live nodes contain current block");
>   }
>     private DNAddrPair chooseDataNode(LocatedBlock block)
>       throws IOException {
>       int failures = 0;
>       while (true) {
>         DatanodeInfo[] nodes = block.getLocations();
>         try {
>           DatanodeInfo chosenNode = bestNode(nodes, deadNodes);
>           InetSocketAddress targetAddr = DataNode.createSocketAddr(chosenNode.getName());
>           return new DNAddrPair(chosenNode, targetAddr);
>         } catch (IOException ie) {
>           String blockInfo = block.getBlock() + " file=" + src;
>           if (failures >= MAX_BLOCK_ACQUIRE_FAILURES) {
>             throw new IOException("Could not obtain block: " + blockInfo);
>           }
>           
>           if (nodes == null || nodes.length == 0) {
>             LOG.info("No node available for block: " + blockInfo);
>           }
>           LOG.info("Could not obtain block " + block.getBlock() + " from any node:  "
+ ie);
>           try {
>             Thread.sleep(3000);
>           } catch (InterruptedException iex) {
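>             // Sleep interrupted; ignore it and retry the loop.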
>           }
>           deadNodes.clear(); //2nd option is to remove only nodes[blockId]
>           openInfo();
>           failures++;
>           continue;
>         }
>       }
>     } 
> {code}
> It seems to pick the first good replica.
> This means that even though some replica is local to the node where the client runs,
> it may actually pick a remote replica.
> Map/reduce tries to schedule a mapper on a node where some copy of the input split data
> is local to that node.
> However, if the DFSClient does not use that info when choosing a replica to read from,
> the mapper may well have to read its data over the network, even though a local replica
> is available.
> I hope I missed something and misunderstood the code.
> Otherwise, this will be a serious performance problem.
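
For illustration, here is a minimal, hypothetical variant of bestNode() along the lines the report asks about: prefer a live replica on the local host, then fall back to priority order. It assumes DatanodeInfo exposes a getHostName() accessor and reuses the deadNodes map from the excerpt above; it is a sketch, not actual DFSClient code, and the Invalid resolution above reflects that the namenode's pre-sorting already achieves the same effect.

{code}
// Hypothetical sketch only -- not the actual DFSClient code. Assumes
// DatanodeInfo.getHostName() returns the datanode's hostname.
private DatanodeInfo bestNode(DatanodeInfo nodes[],
                              AbstractMap<DatanodeInfo, DatanodeInfo> deadNodes)
                              throws IOException {
  if (nodes != null) {
    String localHost = java.net.InetAddress.getLocalHost().getHostName();
    // First pass: prefer a live replica on the local host.
    for (DatanodeInfo node : nodes) {
      if (!deadNodes.containsKey(node) && localHost.equals(node.getHostName())) {
        return node;
      }
    }
    // Second pass: fall back to the first live replica in priority order.
    for (DatanodeInfo node : nodes) {
      if (!deadNodes.containsKey(node)) {
        return node;
      }
    }
  }
  throw new IOException("No live nodes contain current block");
}
{code}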

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

