hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bibin A Chundatt (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-5918) Opportunistic scheduling allocate request failure when NM lost
Date Tue, 22 Nov 2016 19:27:59 GMT

     [ https://issues.apache.org/jira/browse/YARN-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Bibin A Chundatt updated YARN-5918:
-----------------------------------
    Attachment: YARN-5918.0002.patch

Node gets removed from scheduler while creating remote Node from least used Node. Attaching
patch after adding testcase to simulate the same.

> Opportunistic scheduling allocate request failure when NM lost
> --------------------------------------------------------------
>
>                 Key: YARN-5918
>                 URL: https://issues.apache.org/jira/browse/YARN-5918
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>         Attachments: YARN-5918.0001.patch, YARN-5918.0002.patch
>
>
> Allocate request failure during Opportunistic container allocation when nodemanager is
lost 
> {noformat}
> 2016-11-20 10:38:49,011 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
USER=root     OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1479637990302_0002
   CONTAINERID=container_e12_1479637990302_0002_01_000006  RESOURCE=<memory:1024, vCores:1>
> 2016-11-20 10:38:49,011 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Removed node docker2:38297 clusterResource: <memory:4096, vCores:8>
> 2016-11-20 10:38:49,434 WARN org.apache.hadoop.ipc.Server: IPC Server handler 7 on 8030,
call Call#35 Retry#0 org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from
172.17.0.2:51584
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.convertToRemoteNode(OpportunisticContainerAllocatorAMService.java:420)
>         at org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.convertToRemoteNodes(OpportunisticContainerAllocatorAMService.java:412)
>         at org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.getLeastLoadedNodes(OpportunisticContainerAllocatorAMService.java:402)
>         at org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.allocate(OpportunisticContainerAllocatorAMService.java:236)
>         at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>         at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:467)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:990)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:846)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:789)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2539)
> 2016-11-20 10:38:50,824 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
container_e12_1479637990302_0002_01_000002 Container Transitioned from RUNNING to COMPLETED
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message