hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-1985) Abstract node to switch mapping into a topology service class used by namenode and jobtracker
Date Tue, 04 Aug 2015 03:44:04 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Allen Wittenauer updated HADOOP-1985:
-------------------------------------
    Release Note: 
This issue introduces rack awareness for map tasks. It also moves the rack resolution logic
to the central servers - NameNode & JobTracker. The administrator can specify a loadable
class given by topology.node.switch.mapping.impl to specify the class implementing the logic
for rack resolution. The class must implement a method - resolve(List<String> names),
where names is the list of DNS-names/IP-addresses that we want resolved. The return value
is a list of resolved network paths of the form /foo/rack, where rack is the rackID where
the node belongs to and foo is the switch where multiple racks are connected, and so on. The
default implementation of this class is packaged along with hadoop and points to org.apache.hadoop.net.ScriptBasedMapping
and this class loads a script that can be used for rack resolution. The script location is
configurable. It is specified by topology.script.file.name and defaults to an empty script.
In the case where the script name is empty, /default-rack is returned for all dns-names/IP-addresses.
The loadable topology.node.switch.mapping.impl provides administrators fleixibilty to define
how their site's node resolution should happen.
For mapred, one can also specify the level of the cache w.r.t the number of levels in the
resolved network path - defaults to two. This means that the JobTracker will cache tasks at
the host level and at the rack level. 
Known issue: the task caching will not work with levels greater than 2 (beyond racks). This
bug is tracked in HADOOP-3296.

  was:
This issue introduces rack awareness for map tasks. It also moves the rack resolution logic
to the central servers - NameNode & JobTracker. The administrator can specify a loadable
class given by topology.node.switch.mapping.impl to specify the class implementing the logic
for rack resolution. The class must implement a method - resolve(List\<String\> names),
where names is the list of DNS-names/IP-addresses that we want resolved. The return value
is a list of resolved network paths of the form /foo/rack, where rack is the rackID where
the node belongs to and foo is the switch where multiple racks are connected, and so on. The
default implementation of this class is packaged along with hadoop and points to org.apache.hadoop.net.ScriptBasedMapping
and this class loads a script that can be used for rack resolution. The script location is
configurable. It is specified by topology.script.file.name and defaults to an empty script.
In the case where the script name is empty, /default-rack is returned for all dns-names/IP-addresses.
The loadable topology.node.switch.mapping.impl provides administrators fleixibilty to define
how their site's node resolution should happen.
For mapred, one can also specify the level of the cache w.r.t the number of levels in the
resolved network path - defaults to two. This means that the JobTracker will cache tasks at
the host level and at the rack level. 
Known issue: the task caching will not work with levels greater than 2 (beyond racks). This
bug is tracked in HADOOP-3296.


> Abstract node to switch mapping into a topology service class used by namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.17.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 1985.v11.patch, 1985.v19.patch,
1985.v2.patch, 1985.v20.patch, 1985.v23.patch, 1985.v24.patch, 1985.v25.patch, 1985.v3.patch,
1985.v4.patch, 1985.v5.patch, 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch location in
both the namenode and job tracker.  Currently the namenode asks the data nodes for this info
and they run a local script to answer this question.  In our environment and others that I
know of there is no reason to push this to each node.  It is easier to maintain a centralized
script that maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch mappings and
invokes a loadable class or a configurable system call to resolve unknown DNS to switch mappings.
 We can then add this to the namenode to support the current block to switch mapping needs
and simplify the data nodes.  We can also add this same callout to the job tracker and then
implement rack locality logic there without needing to chane the filesystem API or the split
planning API.
> Not only is this the least intrusive path to building racklocal MR I can ID, it is also
future compatible to future infrastructures that may derive topology on the fly, etc, etc...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message