[ https://issues.apache.org/jira/browse/YARN-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094998#comment-14094998
]
Jian He commented on YARN-2378:
-------------------------------
[~subru], thanks for the patch! Some comments:
- We may ignore the MOVE event in the NEW_SAVING state, because client app submission is generally not
considered successful until the app is saved. See YarnClient#submitApplication.
{code}
.addTransition(RMAppState.NEW_SAVING, RMAppState.NEW_SAVING,
RMAppEventType.MOVE, new RMAppMoveTransition())
{code}
- Use AbstractYarnScheduler#getApplicationAttempt instead of the following:
{code}
FiCaSchedulerApp app =
(FiCaSchedulerApp) applications.get(appId).getCurrentAppAttempt();
{code}
- getCheckLeafQueue: how about renaming it to getAndCheckLeafQueue?
- ParentQueue#addApplication: it seems that moving an app from one leaf queue to another within the same
parent queue will cause the parent queue's numApplications to increase. (Can you add a test for this,
if I'm right?)
- Containers are not re-reserved in CapacityScheduler, but they are re-reserved in SchedulerApplicationAttempt.
Should we re-reserve the containers?
{code}
for (Map<NodeId, RMContainer> map : reservedContainers.values()) {
  for (RMContainer reservedContainer : map.values()) {
    Resource resource = reservedContainer.getReservedResource();
    oldMetrics.unreserveResource(user, resource);
    newMetrics.reserveResource(user, resource);
  }
}
{code}
- I'm concerned that accessing the parent queue while holding the child queue's lock will cause a deadlock.
We could probably use a synchronized block that protects only the metrics being updated.
{code}
synchronized public void detachContainer(Resource clusterResource,
    FiCaSchedulerApp application, RMContainer rmContainer) {
  if (application != null) {
    releaseResource(clusterResource, application, rmContainer.getContainer()
        .getResource());
    LOG.info("movedContainer" + " container=" + rmContainer.getContainer()
        + " resource=" + rmContainer.getContainer().getResource()
        + " queueMoveOut=" + this + " usedCapacity=" + getUsedCapacity()
        + " absoluteUsedCapacity=" + getAbsoluteUsedCapacity() + " used="
        + usedResources + " cluster=" + clusterResource);
    // Inform the parent queue
    getParent().detachContainer(clusterResource, application, rmContainer);
  }
}
{code}
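A minimal sketch of the narrower locking suggested in the last comment. It does not use the real LeafQueue/ParentQueue classes: LeafQueueSketch, ParentQueueSketch and DetachSketchDemo are hypothetical stand-ins, and a plain long replaces Resource. The leaf queue updates its own counter inside a synchronized block and only then calls the parent, so the leaf's lock is no longer held during the upward call.

```java
// Hypothetical stand-in for ParentQueue; not the real YARN class.
class ParentQueueSketch {
    private long used;

    synchronized void attachContainer(long amount) { used += amount; }

    synchronized void detachContainer(long amount) { used -= amount; }

    synchronized long getUsed() { return used; }
}

// Hypothetical stand-in for LeafQueue; not the real YARN class.
class LeafQueueSketch {
    private final ParentQueueSketch parent;
    private long used;

    LeafQueueSketch(ParentQueueSketch parent) { this.parent = parent; }

    void attachContainer(long amount) {
        // Hold the leaf lock only while touching leaf-local state.
        synchronized (this) {
            used += amount;
        }
        // The leaf lock is released before the parent lock is taken.
        parent.attachContainer(amount);
    }

    void detachContainer(long amount) {
        synchronized (this) {
            used -= amount;
        }
        // Inform the parent queue without holding the leaf lock.
        parent.detachContainer(amount);
    }

    synchronized long getUsed() { return used; }
}

class DetachSketchDemo {
    public static void main(String[] args) {
        ParentQueueSketch root = new ParentQueueSketch();
        LeafQueueSketch leaf = new LeafQueueSketch(root);
        leaf.attachContainer(4096);
        leaf.detachContainer(1024);
        System.out.println("leaf=" + leaf.getUsed() + " parent=" + root.getUsed());
    }
}
```

The trade-off is that the leaf and parent updates are no longer atomic as a pair, so a concurrent reader may briefly see the leaf updated before the parent. Whether that is acceptable for the queue metrics here is an open question for the patch.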
> Adding support for moving apps between queues in Capacity Scheduler
> -------------------------------------------------------------------
>
> Key: YARN-2378
> URL: https://issues.apache.org/jira/browse/YARN-2378
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacityscheduler
> Reporter: Subramaniam Venkatraman Krishnan
> Assignee: Subramaniam Venkatraman Krishnan
> Labels: capacity-scheduler
> Attachments: YARN-2378.patch, YARN-2378.patch, YARN-2378.patch
>
>
> As discussed with [~leftnoteasy] and [~jianhe], we are breaking up YARN-1707 into smaller
patches for manageability. This JIRA will address adding support for moving apps between queues
in Capacity Scheduler.