hadoop-yarn-issues mailing list archives

From "Szilard Nemeth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7528) Resource types that use units need to be defined at RM level and NM level or when using small units you will overflow max_allocation calculation
Date Tue, 16 Jan 2018 13:45:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16327129#comment-16327129 ]

Szilard Nemeth commented on YARN-7528:
--------------------------------------

Tried several things to reproduce this issue without success; please see the summary of my findings below.
 *Actually, the only way I was able to reproduce it was to put a custom resource type with a value equal to {{Long.MAX_VALUE}} into node-resources.xml, but that does not look like a valid configuration to me; moreover, it differs from what this issue describes, namely that the default configuration causes the overflow.*

Based on the exception message and the method signature of {{org.apache.hadoop.yarn.util.UnitsConversionUtil.convert}}, {{fromUnit}} should be empty, {{toUnit}} should be "m", and {{fromValue}} should equal {{Long.MAX_VALUE}}.
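To make the failure concrete, here is a worked example of the arithmetic (my own illustration, not the {{UnitsConversionUtil}} source): converting from the empty base unit to milli-units ("m") scales the value up by 1000, which cannot be represented for {{Long.MAX_VALUE}}:
{code:java}
// Worked example of the overflow (illustration only, not UnitsConversionUtil):
// 1 base unit = 1000 milli-units, so converting "" -> "m" multiplies by 1000,
// and Long.MAX_VALUE * 1000 does not fit into a long.
long fromValue = Long.MAX_VALUE;   // 9223372036854775807, units ""
long milliFactor = 1000L;          // base unit -> "m"
// throws java.lang.ArithmeticException: long overflow
long toValue = Math.multiplyExact(fromValue, milliFactor);
{code}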

Based on the stacktrace, the call to {{UnitsConversionUtil.convert}} comes from {{DominantResourceCalculator.normalize:444}}; this is the call:
{code:java}
long maximumValue = UnitsConversionUtil.convert(
          maximumResourceInformation.getUnits(),
          rResourceInformation.getUnits(),
          maximumResourceInformation.getValue());
{code}
From these, I tried to track down how the call to {{maximumResourceInformation.getValue()}} could return {{Long.MAX_VALUE}}.

Please see the steps I took, following the original stack trace and the call hierarchy:
 1. {{FairScheduler.getNormalizedResource()}}: the third parameter here is the {{getMaximumResourceCapability()}} call.

2. {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler#getMaximumResourceCapability()}}, which calls
 {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker#getMaxAllowedAllocation}}.

3. In the {{ClusterNodeTracker#getMaxAllowedAllocation}} method, with the default configuration, the returned {{Resource}} contains {{ResourceInformation}} entries whose values are capped by the {{maxAllocation}} field, so in theory one element of this array (the value of a resource) should be {{Long.MAX_VALUE}}.
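A simplified sketch of how I read that method (my paraphrase, not the verbatim source; the {{-1}} sentinel meaning "no node has reported this resource yet" is my assumption):
{code:java}
// Simplified paraphrase of ClusterNodeTracker#getMaxAllowedAllocation
// (illustration, not the actual source): for every known resource type,
// fall back to Long.MAX_VALUE when nothing has been reported yet,
// otherwise cap the value by what the tracked nodes actually have.
for (int i = 0; i < maxAllocation.length; i++) {
  long value = (maxAllocation[i] == -1)
      ? Long.MAX_VALUE      // nothing reported yet -> effectively unlimited
      : maxAllocation[i];   // capped by the tracked node resources
  ret.getResourceInformation(i).setValue(value);
}
{code}
This would explain how one element of the returned array ends up as {{Long.MAX_VALUE}} with the default configuration.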

4. The {{ClusterNodeTracker.maxAllocation}} field is updated in {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker#updateMaxResources}}.
 Looking at the source of this method, the {{maxAllocation}} array takes its values from {{node.getTotalResource()}}, so I tried to track down how the {{SchedulerNode}}'s {{totalResource}} could take on this high value.
 Since all the implementation classes just call the constructor of the abstract class, I investigated in that direction.

5. {{SchedulerNode}}'s constructor, relevant line:
{code:java}
this.totalResource = Resources.clone(node.getTotalCapability());
{code}
{{node}} is an instance of {{RMNode}}; after looking at how {{RMNodeImpl.getTotalCapability()}} works, I checked where {{RMNodeImpl}} is created.

6. {{RMNodeImpl}} is created in {{org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService#registerNodeManager}}.
 Since the {{totalCapability}} field of {{RMNodeImpl}} is updated in many places, I tried to filter for the most relevant one, so I just checked the constructor; it would have been very hard to check every scenario where this field could be updated.
 Still in {{registerNodeManager}}, I saw that {{totalCapability}} is set from the {{RegisterNodeManagerRequest}} at the beginning of the method:
{code:java}
Resource capability = request.getResource();
{code}
This is the boundary of the {{ResourceManager}}, because the {{RegisterNodeManagerRequest}} is sent from the NM to the RM.

7. Checked where the {{RegisterNodeManagerRequest}} is created and found only one occurrence:
 {{org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl#registerWithRM}} creates the request.
 This is the call:
{code:java}
RegisterNodeManagerRequest.newInstance(nodeId, httpPort, totalResource,
              nodeManagerVersionId, containerReports, getRunningApplications(),
              nodeLabels, physicalResource);
{code}
The relevant field is {{totalResource}}, so I checked where this field is updated.

8. In {{org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl#serviceInit}}, {{totalResource}}
is updated with:
{code:java}
this.totalResource = NodeManagerHardwareUtils.getNodeResources(conf);
{code}
9. Looking at the implementation of {{NodeManagerHardwareUtils.getNodeResources()}}:
 This is the method that reads node-resources.xml and sets the node's total resources.
 I could not find a scenario where the value of any custom resource is set to {{Long.MAX_VALUE}}.
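For illustration, this is the state the stack trace implies but that I could not produce with a sane configuration (my own construction, not NM code; it corresponds to scenario 1 below, where the value is set explicitly):
{code:java}
import org.apache.hadoop.yarn.api.records.ResourceInformation;

// Illustration only: a node-resources.xml value with no unit suffix keeps the
// empty base unit, so scenario 1 below makes the gpu resource arrive at the RM
// as value=Long.MAX_VALUE with units="" -- exactly the input that makes
// UnitsConversionUtil.convert("", "m", value) overflow.
ResourceInformation gpu = ResourceInformation.newInstance(
    "gpu", "", 9223372036854775807L);   // Long.MAX_VALUE, base unit ""
{code}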
----
To be able to reproduce the issue, I tried several combinations of resource settings in node-resources.xml; I always started the pi job from the console.
 Example parameters:
{code:java}
 "pi -Dmapreduce.framework.name=yarn -Dmapreduce.map.resource.gpu=5000m 10 100".
{code}
Please note that I sometimes used different values for the {{-Dmapreduce.map.resource}} parameters.
 In all cases I used this resource-types.xml file:
{code:xml}
<configuration>
	<property>
	   <name>yarn.resource-types</name>
	   <value>gpu,fpga</value>
	</property>
</configuration>
{code}
The scenarios I tried:

*1. node-resources.xml: gpu defined as Long.MAX_VALUE --> same exception as in the issue, hangs*
{code:xml}
<property>
   <name>yarn.nodemanager.resource-type.gpu</name>
   <value>9223372036854775807</value>
</property>
{code}
job parameters:
 A.) -Dmapreduce.map.resource.gpu=5000m --> hangs
 B.) -Dmapreduce.map.resource.fpga=5000m --> does not hang

2. node-resources.xml: no custom types defined --> no exception, does not hang

3. node-resources.xml: fpga defined with value 1
{code:xml}
<property>
   <name>yarn.nodemanager.resource-type.fpga</name>
   <value>1</value>
</property>
{code}
job parameters:
 A.) -Dmapreduce.map.resource.gpu=5000m --> does not hang
 B.) -Dmapreduce.map.resource.fpga=5000m --> does not hang

4. node-resources.xml: gpu defined with value 1
{code:xml}
<property>
   <name>yarn.nodemanager.resource-type.gpu</name>
   <value>1</value>
</property>
{code}
job parameters:
 A.) -Dmapreduce.map.resource.gpu=5000m --> does not hang
 B.) -Dmapreduce.map.resource.fpga=5000m --> does not hang

5. node-resources.xml: gpu defined without value
{code:xml}
<property>
   <name>yarn.nodemanager.resource-type.gpu</name>
</property>
{code}
job parameters:
 A.) -Dmapreduce.map.resource.gpu=5000m --> does not hang
 B.) -Dmapreduce.map.resource.fpga=5000m --> does not hang

> Resource types that use units need to be defined at RM level and NM level or when using small units you will overflow max_allocation calculation
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7528
>                 URL: https://issues.apache.org/jira/browse/YARN-7528
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: documentation, resourcemanager
>    Affects Versions: 3.0.0
>            Reporter: Grant Sohn
>            Assignee: Szilard Nemeth
>            Priority: Major
>
> When the unit is not defined in the RM, the LONG_MAX default will overflow in the conversion step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
