hadoop-yarn-issues mailing list archives

From "Bibin A Chundatt (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3884) RMContainerImpl transition from RESERVED to KILL apphistory status not updated
Date Sat, 03 Dec 2016 04:59:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717413#comment-15717413 ]

Bibin A Chundatt commented on YARN-3884:
----------------------------------------

[~varun_saxena]
Are any more changes required?

> RMContainerImpl transition from RESERVED to KILL apphistory status not updated
> ------------------------------------------------------------------------------
>
>                 Key: YARN-3884
>                 URL: https://issues.apache.org/jira/browse/YARN-3884
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>         Environment: Suse11 Sp3
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>              Labels: oct16-easy
>         Attachments: 0001-YARN-3884.patch, Apphistory Container Status.jpg, Elapsed Time.jpg, Test Result-Container status.jpg, YARN-3884.0002.patch, YARN-3884.0003.patch, YARN-3884.0004.patch, YARN-3884.0005.patch
>
>
> Setup
> ===============
> 1 NM, 3072 MB and 16 cores each
> Steps to reproduce
> ===============
> 1. Submit apps to Queue 1 with 512 MB and 1 core
> 2. Submit apps to Queue 2 with 512 MB and 5 cores
> Many containers get reserved and unreserved in this case.
> {code}
> 2015-07-02 20:45:31,169 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
container_e24_1435849994778_0002_01_000013 Container Transitioned from NEW to RESERVED
> 2015-07-02 20:45:31,170 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
Reserved container  application=application_1435849994778_0002 resource=<memory:512, vCores:5>
queue=QueueA: capacity=0.4, absoluteCapacity=0.4, usedResources=<memory:2560, vCores:21>,
usedCapacity=1.6410257, absoluteUsedCapacity=0.65625, numApps=1, numContainers=5 usedCapacity=1.6410257
absoluteUsedCapacity=0.65625 used=<memory:2560, vCores:21> cluster=<memory:6144,
vCores:32>
> 2015-07-02 20:45:31,170 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Re-sorting assigned queue: root.QueueA stats: QueueA: capacity=0.4, absoluteCapacity=0.4,
usedResources=<memory:3072, vCores:26>, usedCapacity=2.0317461, absoluteUsedCapacity=0.8125,
numApps=1, numContainers=6
> 2015-07-02 20:45:31,170 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
assignedContainer queue=root usedCapacity=0.96875 absoluteUsedCapacity=0.96875 used=<memory:5632,
vCores:31> cluster=<memory:6144, vCores:32>
> 2015-07-02 20:45:31,191 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
container_e24_1435849994778_0001_01_000014 Container Transitioned from NEW to ALLOCATED
> 2015-07-02 20:45:31,191 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
USER=dsperf   OPERATION=AM Allocated Container        TARGET=SchedulerApp     RESULT=SUCCESS
 APPID=application_1435849994778_0001    CONTAINERID=container_e24_1435849994778_0001_01_000014
> 2015-07-02 20:45:31,191 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
Assigned container container_e24_1435849994778_0001_01_000014 of capacity <memory:512,
vCores:1> on host host-10-19-92-117:64318, which has 6 containers, <memory:3072, vCores:14>
used and <memory:0, vCores:2> available after allocation
> 2015-07-02 20:45:31,191 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
assignedContainer application attempt=appattempt_1435849994778_0001_000001 container=Container:
[ContainerId: container_e24_1435849994778_0001_01_000014, NodeId: host-10-19-92-117:64318,
NodeHttpAddress: host-10-19-92-117:65321, Resource: <memory:512, vCores:1>, Priority:
20, Token: null, ] queue=default: capacity=0.2, absoluteCapacity=0.2, usedResources=<memory:2560,
vCores:5>, usedCapacity=2.0846906, absoluteUsedCapacity=0.41666666, numApps=1, numContainers=5
clusterResource=<memory:6144, vCores:32>
> 2015-07-02 20:45:31,191 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Re-sorting assigned queue: root.default stats: default: capacity=0.2, absoluteCapacity=0.2,
usedResources=<memory:3072, vCores:6>, usedCapacity=2.5016286, absoluteUsedCapacity=0.5,
numApps=1, numContainers=6
> 2015-07-02 20:45:31,191 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
assignedContainer queue=root usedCapacity=1.0 absoluteUsedCapacity=1.0 used=<memory:6144,
vCores:32> cluster=<memory:6144, vCores:32>
> 2015-07-02 20:45:32,143 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
container_e24_1435849994778_0001_01_000014 Container Transitioned from ALLOCATED to ACQUIRED
> 2015-07-02 20:45:32,174 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Trying to fulfill reservation for application application_1435849994778_0002 on node: host-10-19-92-143:64318
> 2015-07-02 20:45:32,174 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
Reserved container  application=application_1435849994778_0002 resource=<memory:512, vCores:5>
queue=QueueA: capacity=0.4, absoluteCapacity=0.4, usedResources=<memory:3072, vCores:26>,
usedCapacity=2.0317461, absoluteUsedCapacity=0.8125, numApps=1, numContainers=6 usedCapacity=2.0317461
absoluteUsedCapacity=0.8125 used=<memory:3072, vCores:26> cluster=<memory:6144, vCores:32>
> 2015-07-02 20:45:32,174 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Skipping scheduling since node host-10-19-92-143:64318 is reserved by application appattempt_1435849994778_0002_000001
> 2015-07-02 20:45:32,213 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
container_e24_1435849994778_0001_01_000014 Container Transitioned from ACQUIRED to RUNNING
> 2015-07-02 20:45:32,213 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Null container completed...
> 2015-07-02 20:45:33,178 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Trying to fulfill reservation for application application_1435849994778_0002 on node: host-10-19-92-143:64318
> 2015-07-02 20:45:33,178 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
Reserved container  application=application_1435849994778_0002 resource=<memory:512, vCores:5>
queue=QueueA: capacity=0.4, absoluteCapacity=0.4, usedResources=<memory:3072, vCores:26>,
usedCapacity=2.0317461, absoluteUsedCapacity=0.8125, numApps=1, numContainers=6 usedCapacity=2.0317461
absoluteUsedCapacity=0.8125 used=<memory:3072, vCores:26> cluster=<memory:6144, vCores:32>
> 2015-07-02 20:45:33,178 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Skipping scheduling since node host-10-19-92-143:64318 is reserved by application appattempt_1435849994778_0002_000001
> 2015-07-02 20:45:33,704 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
Application application_1435849994778_0002 unreserved  on node host: host-10-19-92-143:64318
#containers=5 available=<memory:512, vCores:3> used=<memory:2560, vCores:13>,
currently has 0 at priority 20; currentReservation <memory:0, vCores:0>
> 2015-07-02 20:45:33,704 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
QueueA used=<memory:2560, vCores:21> numContainers=5 user=dsperf user-resources=<memory:2560,
vCores:21>
> 2015-07-02 20:45:33,710 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
completedContainer container=Container: [ContainerId: container_e24_1435849994778_0002_01_000013,
NodeId: host-10-19-92-143:64318, NodeHttpAddress: host-10-19-92-143:65321, Resource: <memory:512,
vCores:5>, Priority: 20, Token: null, ] queue=QueueA: capacity=0.4, absoluteCapacity=0.4,
usedResources=<memory:2560, vCores:21>, usedCapacity=1.6410257, absoluteUsedCapacity=0.65625,
numApps=1, numContainers=5 cluster=<memory:6144, vCores:32>
> 2015-07-02 20:45:33,710 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
completedContainer queue=root usedCapacity=0.9166667 absoluteUsedCapacity=0.9166667 used=<memory:5632,
vCores:27> cluster=<memory:6144, vCores:32>
> 2015-07-02 20:45:33,711 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
Re-sorting completed queue: root.QueueA stats: QueueA: capacity=0.4, absoluteCapacity=0.4,
usedResources=<memory:2560, vCores:21>, usedCapacity=1.6410257, absoluteUsedCapacity=0.65625,
numApps=1, numContainers=5
> 2015-07-02 20:45:33,711 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Application attempt appattempt_1435849994778_0002_000001 released container container_e24_1435849994778_0002_01_000013
on node: host: host-10-19-92-143:64318 #containers=5 available=<memory:512, vCores:3>
used=<memory:2560, vCores:13> with event: KILL
> {code}
> *Impact:*
> In the application history server, the status gets updated to -1000 (INVALID),
> but the end time is not updated, so Elapsed Time keeps changing.
> Please check the attached snapshots.
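
The impact described above can be illustrated with a small sketch. It assumes, as a guess at the mechanism rather than a reading of the history server code, that elapsed time is computed from a start time and a finish time, and that the current time is used when no finish time was recorded. The class and method names (ElapsedTimeSketch, elapsed) are hypothetical and are not the YARN API.

```java
// Hypothetical sketch: names and logic are illustrative assumptions, not YARN code.
final class ElapsedTimeSketch {

    /**
     * Returns elapsed milliseconds for a container.
     * When no finish time was recorded (finishedMs <= 0), the current
     * time is used instead, so the result depends on when it is called.
     */
    static long elapsed(long startedMs, long finishedMs, long nowMs) {
        if (startedMs <= 0) {
            return -1L; // container never started
        }
        long end = finishedMs > 0 ? finishedMs : nowMs;
        return end - startedMs;
    }

    public static void main(String[] args) {
        long started = 1_000L;

        // Finish time never written (the reported RESERVED -> KILL case):
        // the value is different on every refresh.
        System.out.println(elapsed(started, 0L, 5_000L)); // 4000
        System.out.println(elapsed(started, 0L, 9_000L)); // 8000

        // Finish time written when the container was killed: the value is stable.
        System.out.println(elapsed(started, 3_700L, 9_000L)); // 2700
    }
}
```

Under this assumption, recording a finish time when the container leaves RESERVED through a KILL event would be enough to make the displayed Elapsed Time stop changing.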



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

