hadoop-mapreduce-issues mailing list archives

From "Hong Tang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1191) JobInProgress.createCache() should not add unknown hosts to the host-to-rack location mapping.
Date Fri, 06 Nov 2009 09:51:32 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774256#action_12774256 ]

Hong Tang commented on MAPREDUCE-1191:

Related code - JobInProgress.createCache:
  Map<Node, List<TaskInProgress>> createCache(
                         Job.RawSplit[] splits, int maxLevel) {
    Map<Node, List<TaskInProgress>> cache =
      new IdentityHashMap<Node, List<TaskInProgress>>(maxLevel);
    for (int i = 0; i < splits.length; i++) {
      String[] splitLocations = splits[i].getLocations();
      if (splitLocations.length == 0) {
        // ... (body elided in the archived message)
      }

      for (String host : splitLocations) {
        // HERE: host will always be added to the internal hostnamesToNodeMap
        Node node = jobtracker.resolveAndAddToTopology(host);
        LOG.info("tip:" + maps[i].getTIPId() + " has split on node:" + node);
        for (int j = 0; j < maxLevel; j++) {
          List<TaskInProgress> hostMaps = cache.get(node);
          if (hostMaps == null) {
            hostMaps = new ArrayList<TaskInProgress>();
            cache.put(node, hostMaps);
            hostMaps.add(maps[i]);
          }
          // Check whether hostMaps already contains an entry for this TIP.
          // This will be true for nodes that are racks when multiple nodes in
          // the rack contain the input for a TIP. Note that if it already
          // exists in hostMaps, it must be the last element there, since
          // we process one TIP at a time sequentially in split-size order.
          if (hostMaps.get(hostMaps.size() - 1) != maps[i]) {
            hostMaps.add(maps[i]);
          }
          node = node.getParent();
        }
      }
    }
    return cache;
  }
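
To make the loop easier to follow in isolation, here is a minimal self-contained sketch of the same pattern. The class CacheSketch, its Node type, the string task ids and the buildCache helper are hypothetical stand-ins, not Hadoop types; only the shape of the loop is taken from the excerpt above.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the Hadoop types; all names here are assumptions.
class CacheSketch {
  static final class Node {
    final String name;
    final Node parent;
    Node(String name, Node parent) { this.name = name; this.parent = parent; }
  }

  // Registers each task at its host node and at its ancestors, up to maxLevel levels.
  static Map<Node, List<String>> buildCache(Node[][] splitNodes, String[] tasks, int maxLevel) {
    Map<Node, List<String>> cache = new IdentityHashMap<Node, List<String>>();
    for (int i = 0; i < tasks.length; i++) {
      for (Node start : splitNodes[i]) {
        Node node = start;
        for (int j = 0; j < maxLevel && node != null; j++) {
          List<String> hostMaps = cache.get(node);
          if (hostMaps == null) {
            hostMaps = new ArrayList<String>();
            cache.put(node, hostMaps);
          }
          // If this task is already listed for the node, it is the last element,
          // because tasks are processed one at a time.
          if (hostMaps.isEmpty() || !hostMaps.get(hostMaps.size() - 1).equals(tasks[i])) {
            hostMaps.add(tasks[i]);
          }
          node = node.parent;
        }
      }
    }
    return cache;
  }

  public static void main(String[] args) {
    Node rack = new Node("/rack1", null);
    Node h1 = new Node("h1", rack);
    Node h2 = new Node("h2", rack);
    // tip0 has replicas on h1 and h2; tip1 has a replica on h2 only.
    Map<Node, List<String>> cache = buildCache(
        new Node[][] { { h1, h2 }, { h2 } }, new String[] { "tip0", "tip1" }, 2);
    System.out.println(cache.get(rack)); // each task appears once at rack level
  }
}
```

In this sketch tip0 is listed only once under the rack even though two hosts in that rack hold its input, which is the case the comment in the excerpt describes.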

> JobInProgress.createCache() should not add unknown hosts to the host-to-rack location mapping.
> ----------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-1191
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1191
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Hong Tang
> JobInProgress.createCache() currently adds host names specified in raw splits to
> rack "/default-rack" if it does not already know the mapping. This seems to be a bad idea:
> a malicious client can submit jobs with many maps whose locations are non-existent
> hosts and thus use up the JobTracker's memory.
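
The memory concern can be illustrated with a small sketch. The TopologySketch class below is a hypothetical model of a host-to-rack table, not the JobTracker API: resolveAndAdd mimics the behaviour described in the issue, and resolveIfKnown is one possible lookup-only alternative (an assumption for illustration, not necessarily the fix adopted for this issue).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a host-to-rack table; names are assumptions, not Hadoop APIs.
class TopologySketch {
  static final String DEFAULT_RACK = "/default-rack";
  private final Map<String, String> hostToRack = new HashMap<String, String>();

  // Used for hosts that are really part of the cluster.
  void register(String host, String rack) { hostToRack.put(host, rack); }

  // Behaviour described in the issue: an unknown host is stored under the default rack.
  String resolveAndAdd(String host) {
    String rack = hostToRack.get(host);
    if (rack == null) {
      rack = DEFAULT_RACK;
      hostToRack.put(host, rack);
    }
    return rack;
  }

  // Lookup-only alternative: an unknown host yields null and nothing is stored.
  String resolveIfKnown(String host) { return hostToRack.get(host); }

  int size() { return hostToRack.size(); }

  public static void main(String[] args) {
    TopologySketch adding = new TopologySketch();
    TopologySketch lookupOnly = new TopologySketch();
    adding.register("h1", "/rack1");
    lookupOnly.register("h1", "/rack1");
    // Simulate a job whose split locations name 1000 non-existent hosts.
    for (int i = 0; i < 1000; i++) {
      adding.resolveAndAdd("bogus-" + i);
      lookupOnly.resolveIfKnown("bogus-" + i);
    }
    System.out.println(adding.size());     // grows with every bogus host
    System.out.println(lookupOnly.size()); // stays at the number of registered hosts
  }
}
```

With the adding variant the table grows by one entry per distinct bogus host name, so its size is controlled by whoever submits the job; with the lookup-only variant it is bounded by the hosts that actually registered.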
