hadoop-mapreduce-issues mailing list archives

From "Xiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-7100) Provide options to skip adding container request for data-local and rack-local respectively
Date Thu, 31 May 2018 13:33:00 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiang Li updated MAPREDUCE-7100:
--------------------------------
    Description: 
We are using Hadoop 2.7.3, and the compute layer runs outside the storage cluster (that
is, node managers run on a different set of nodes from data nodes). The problem we are
seeing is that container allocation is quite slow for some jobs.
After some debugging, we found the following in org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq()
(the code is from trunk, not 2.7.3):
{code}
  protected void addContainerReq(ContainerRequest req) {
    // Create resource requests
    for (String host : req.hosts) {
      // Data-local
      if (!isNodeBlacklisted(host)) {
        addResourceRequest(req.priority, host, req.capability,
            null);
      }
    }

    // Nothing Rack-local for now
    for (String rack : req.racks) {
      addResourceRequest(req.priority, rack, req.capability,
          null);
    }

    // Off-switch
    addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
        req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the compute layer
is not co-located with the storage cluster.
If I understand correctly, req.hosts and req.racks are provided by the InputFormat. If the mapper
reads from HDFS, req.hosts is the corresponding data node and req.racks is its rack. The AM
debug log looks like this:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: applicationId=1 priority=20 resourceName=<data-node> numContainers=1 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: applicationId=1 priority=20 resourceName=<its rack> numContainers=1 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: applicationId=1 priority=20 resourceName=* numContainers=1 #asks=3
{code}
The resource request with resourceName=<data-node> will eventually go unsatisfied (because
the data node is not a node manager). If we know that the compute layer is separate from the
storage cluster, it would be better to skip the data-local and rack-local requests
(via options) at an earlier stage.
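
To make the proposal concrete, here is a minimal, self-contained sketch of how two such options might change which resource names are requested. It is not the actual RMContainerRequestor code and not a patch: the class name, the method resourceNames(), and the flags skipDataLocal and skipRackLocal are hypothetical names chosen for illustration, and the blacklist check from addContainerReq() is left out for brevity. The only value taken from YARN is "*", which is what ResourceRequest.ANY resolves to and what the debug log above shows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the request-building step in addContainerReq().
class LocalityRequestSketch {
    // Same string as ResourceRequest.ANY in YARN (the off-switch request).
    static final String ANY = "*";

    // Returns the resource names that would be requested for one container.
    // skipDataLocal and skipRackLocal are assumed option names, not real
    // Hadoop configuration keys.
    static List<String> resourceNames(List<String> hosts, List<String> racks,
                                      boolean skipDataLocal, boolean skipRackLocal) {
        List<String> names = new ArrayList<>();
        if (!skipDataLocal) {
            names.addAll(hosts);   // data-local requests
        }
        if (!skipRackLocal) {
            names.addAll(racks);   // rack-local requests
        }
        names.add(ANY);            // the off-switch request is always added
        return names;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("datanode-1");
        List<String> racks = Arrays.asList("/rack-1");
        // Current behavior: three requests, as in the debug log.
        System.out.println(resourceNames(hosts, racks, false, false));
        // Both options enabled: only the off-switch request remains.
        System.out.println(resourceNames(hosts, racks, true, true));
    }
}
```

With both flags off, the sketch yields the same three resource names as the debug log (data node, rack, then *). With both flags on, only * is requested, which is the behavior this issue asks for when node managers and data nodes do not overlap. Keeping the two flags separate matches the "respectively" in the issue title, since a deployment might still want rack-local requests.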



  was:
We are using Hadoop 2.7.3, and the compute layer runs outside the storage cluster (that
is, node managers run on a different set of nodes from data nodes). The problem we are
seeing is that container allocation is quite slow for some jobs.
After some debugging, we found the following in org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#addContainerReq()
(the code is from trunk, not 2.7.3):
{code}
  protected void addContainerReq(ContainerRequest req) {
    // Create resource requests
    for (String host : req.hosts) {
      // Data-local
      if (!isNodeBlacklisted(host)) {
        addResourceRequest(req.priority, host, req.capability,
            null);
      }
    }

    // Nothing Rack-local for now
    for (String rack : req.racks) {
      addResourceRequest(req.priority, rack, req.capability,
          null);
    }

    // Off-switch
    addResourceRequest(req.priority, ResourceRequest.ANY, req.capability,
        req.nodeLabelExpression);
  }
{code}

It seems that the data-local and rack-local requests could be skipped when the compute layer
is not co-located with the storage cluster.
If I understand correctly, req.hosts and req.racks are provided by the InputFormat. If the mapper
reads from HDFS, req.hosts is the corresponding data node and req.racks is its rack. The AM
debug log looks like this:
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: applicationId=1 priority=20 resourceName=<data-node> numContainers=1…256 #asks=1
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: applicationId=1 priority=20 resourceName=<its rack> numContainers=1…256 #asks=2
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: addResourceRequest: applicationId=1 priority=20 resourceName=* numContainers=1…256 #asks=3
{code}
The resource request with resourceName=<data-node> will eventually go unsatisfied (because
the data node is not a node manager). If we know that the compute layer is separate from the
storage cluster, it would be better to skip the data-local and rack-local requests
(via options) at an earlier stage.




> Provide options to skip adding container request for data-local and rack-local respectively
> -------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7100
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7100
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster
>            Reporter: Xiang Li
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org

