hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker
Date Tue, 26 Feb 2008 18:04:52 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572581#action_12572581 ]

Devaraj Das commented on HADOOP-1985:

Actually it is easier than that: only one check is required, and it adds no extra space per
TIP. If the last element of the cache list in question is the TIP that we are trying to insert,
we don't insert it. Here is how it looks; consider the earlier inner 'for' loop inside
createCache. The modified code there:

+          if (hostMaps == null) {
+            hostMaps = new ArrayList<TaskInProgress>();
+            cache.put(node, hostMaps);
+            hostMaps.add(maps[i]);
+          }
+          //check whether the hostMaps already contains an entry for a TIP
+          //This will be true for nodes that are racks and multiple nodes in
+          //the rack contain the input for a tip. Note that if it already
+          //exists in the hostMaps, it must be the last element there since
+          //we process one TIP at a time sequentially in the split-size order
+          if (hostMaps.get(hostMaps.size() - 1) != maps[i]) {
+            hostMaps.add(maps[i]);
+          }
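
To make the last-element check easier to try in isolation, here is a minimal, self-contained sketch of the same idea. It is not the code from the patch: the class LastElementDedup, the Tip type, the addOnce and buildCache helpers, and the host-to-rack map are all hypothetical stand-ins for TaskInProgress, the topology nodes, and createCache. It relies on the same assumption as the snippet above, namely that TIPs are processed one at a time, so a duplicate can only be the last element of the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the cache-building loop; not the patch itself.
class LastElementDedup {

    // Simplified placeholder for TaskInProgress: a name plus the hosts holding its input.
    static final class Tip {
        final String name;
        final List<String> hosts;

        Tip(String name, String... hosts) {
            this.name = name;
            this.hosts = Arrays.asList(hosts);
        }
    }

    // Appends tip to the list for node unless it is already the last element.
    // Correct only because each TIP is fully processed before the next one starts.
    static void addOnce(Map<String, List<Tip>> cache, String node, Tip tip) {
        List<Tip> hostMaps = cache.get(node);
        if (hostMaps == null) {
            hostMaps = new ArrayList<>();
            cache.put(node, hostMaps);
            hostMaps.add(tip);
            return;
        }
        // Reference comparison, as in the snippet above.
        if (hostMaps.get(hostMaps.size() - 1) != tip) {
            hostMaps.add(tip);
        }
    }

    // Registers every TIP under each of its hosts and under each host's rack.
    // rackOf is an assumed host-to-rack lookup supplied by the caller.
    static Map<String, List<Tip>> buildCache(List<Tip> tips, Map<String, String> rackOf) {
        Map<String, List<Tip>> cache = new HashMap<>();
        for (Tip tip : tips) {
            for (String host : tip.hosts) {
                addOnce(cache, host, tip);
                addOnce(cache, rackOf.get(host), tip);
            }
        }
        return cache;
    }
}
```

With two hosts in the same rack holding the input for one TIP, the rack's list receives that TIP once rather than twice, and no per-TIP bookkeeping is needed.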

> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.17.0
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v19.patch,
> 1985.v2.patch, 1985.v20.patch, 1985.v23.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch,
> 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
> In order to implement switch locality in MapReduce, we need to have switch location in
> both the namenode and job tracker.  Currently the namenode asks the data nodes for this info
> and they run a local script to answer this question.  In our environment and others that I
> know of there is no reason to push this to each node.  It is easier to maintain a centralized
> script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and
> invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.
> We can then add this to the namenode to support the current block to switch mapping needs
> and simplify the data nodes.  We can also add this same callout to the job tracker and then
> implement rack locality logic there without needing to change the filesystem API or the split
> planning API.
> Not only is this the least intrusive path to building rack-local MR I can identify, it is also
> compatible with future infrastructures that may derive topology on the fly, etc.
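
The caching class proposed in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the design that went into the attached patches: the class name CachedSwitchMapping, the resolve method, and the use of a java.util.function.Function as the pluggable resolver (standing in for the "loadable class or configurable system call") are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a caching DNS-name-to-switch mapper; not Hadoop's API.
class CachedSwitchMapping {

    // Known DNS name -> switch string mappings.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Pluggable lookup for names not yet cached (for example, a wrapper
    // around a centralized script). Assumed to be supplied by the caller.
    private final Function<String, String> resolver;

    CachedSwitchMapping(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    // Returns the cached switch for dnsName, calling the resolver only on a miss.
    // If the resolver returns null, nothing is cached and null is returned.
    String resolve(String dnsName) {
        return cache.computeIfAbsent(dnsName, resolver);
    }
}
```

Both the namenode and the job tracker could hold one such object, so the external lookup runs at most once per successfully resolved DNS name in each process.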

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
