hadoop-yarn-issues mailing list archives

From "zhangyubiao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java
Date Fri, 25 Nov 2016 08:01:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15695182#comment-15695182 ]

zhangyubiao commented on YARN-4090:
-----------------------------------

Found one Java-level deadlock:
=============================
"IPC Server handler 98 on 8032":
  waiting to lock monitor 0x00007f4e48b1f808 (object 0x00007f42e17a5ed8, a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue),
  which is held by "IPC Server handler 76 on 8032"
"IPC Server handler 76 on 8032":
  waiting to lock monitor 0x00007f4e388b94f8 (object 0x00007f42df3e8450, a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue),
  which is held by "ResourceManager Event Processor"
"ResourceManager Event Processor":
  waiting to lock monitor 0x00007f4e48b1f808 (object 0x00007f42e17a5ed8, a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue),
  which is held by "IPC Server handler 76 on 8032"

Java stack information for the threads listed above:
===================================================
"IPC Server handler 98 on 8032":
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:149)
	- waiting to lock <0x00007f42e17a5ed8> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1468)
	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:903)
	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:280)
	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:431)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
"IPC Server handler 76 on 8032":
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:149)
	- waiting to lock <0x00007f42df3e8450> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:156)
	- locked <0x00007f42e17a5ed8> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1468)
	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:903)
	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:280)
	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:431)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
"ResourceManager Event Processor":
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.decResourceUsage(FSQueue.java:307)
	- waiting to lock <0x00007f42e17a5ed8> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.decResourceUsage(FSQueue.java:309)
	- locked <0x00007f42df3e8450> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.decResourceUsage(FSQueue.java:309)
	- locked <0x00007f42e0c7cf50> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.containerCompleted(FSAppAttempt.java:157)
	- locked <0x00007f42deaf9aa8> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:829)
	- eliminated <0x00007f42deaf8288> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:984)
	- locked <0x00007f42deaf8288> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1195)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:121)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:680)
	at java.lang.Thread.run(Thread.java:745)

Found 1 deadlock.
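
Reading the dump above: "IPC Server handler 76" holds 0x00007f42e17a5ed8 and waits for 0x00007f42df3e8450 inside a nested getQueueUserAclInfo() call, while "ResourceManager Event Processor" holds 0x00007f42df3e8450 and waits for 0x00007f42e17a5ed8 inside a nested decResourceUsage() call. That looks like a lock-order inversion between a top-down walk and a bottom-up walk of the queue tree. The sketch below is only an illustration of one way to avoid that pattern, not the actual YARN code or the fix in any patch: the class QueueLockSketch, the Node type, and the methods adjustUsage() and collectNames() are all hypothetical names. The top-down walk copies the child list under the node's monitor and recurses only after releasing it, so it never holds a parent while waiting for a child.

```java
import java.util.ArrayList;
import java.util.List;

public class QueueLockSketch {

    /** Hypothetical stand-in for a queue in the tree; not the real FSParentQueue. */
    static class Node {
        final String name;
        final Node parent;
        private final List<Node> children = new ArrayList<>();
        private long usage;

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }

        /** Creates a child node and registers it under this node's monitor. */
        Node newChild(String childName) {
            Node child = new Node(childName, this);
            synchronized (this) {
                children.add(child);
            }
            return child;
        }

        /**
         * Bottom-up update: holds this node's monitor while acquiring the
         * parent's, the same child-then-parent order seen for the
         * decResourceUsage() frames in the dump.
         */
        synchronized void adjustUsage(long delta) {
            usage += delta;
            if (parent != null) {
                parent.adjustUsage(delta);
            }
        }

        synchronized long getUsage() {
            return usage;
        }

        /**
         * Top-down walk: snapshot the children under the monitor, then
         * recurse after releasing it. At most one monitor is held at a time,
         * so this walk cannot form a cycle with the bottom-up path.
         */
        List<String> collectNames() {
            List<Node> snapshot;
            synchronized (this) {
                snapshot = new ArrayList<>(children);
            }
            List<String> names = new ArrayList<>();
            names.add(name);
            for (Node child : snapshot) {
                names.addAll(child.collectNames());
            }
            return names;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Node root = new Node("root", null);
        Node mid = root.newChild("root.a");
        Node leaf = mid.newChild("root.a.b");

        // One thread walks top-down while another updates bottom-up.
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                root.collectNames();
            }
        });
        Thread updater = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                leaf.adjustUsage(1);
            }
        });
        reader.start();
        updater.start();
        reader.join();
        updater.join();
        System.out.println(root.collectNames() + " usage=" + root.getUsage());
    }
}
```

The trade-off in this sketch is that the top-down walk no longer sees one atomic snapshot of the whole tree; whether that is acceptable for the real ACL query is a question for the scheduler code itself.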

> Make Collections.sort() more efficient in FSParentQueue.java
> ------------------------------------------------------------
>
>                 Key: YARN-4090
>                 URL: https://issues.apache.org/jira/browse/YARN-4090
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: Xianyin Xin
>            Assignee: Xianyin Xin
>         Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, YARN-4090.001.patch, YARN-4090.002.patch, YARN-4090.003.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

