hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2069) Add cross-user preemption within CapacityScheduler's leaf-queue
Date Tue, 24 Jun 2014 08:04:24 GMT

    [ https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041845#comment-14041845
] 

Wangda Tan commented on YARN-2069:
----------------------------------

Hi Mayank,
Thanks for you patch, I just looked at it, some comments on implementation,

*In ProportionalCapacityPreemptionPolicy,*
1) balanceUserLimitsinQueueForPreemption()
1.1, I think there's a bug when multiple applications under a same user (say Jim) in a queue,
and usage of Jim is over user-limit.
Any of Jim's applications will be tried to be preempted (total-resource-used-by-Jim - user-limit).
We should remember resourcesToClaimBackFromUser and initialRes for each user (not reset them
when handling each application)
And it's better to add test to make sure this behavior is correct.

1.2, Some debug logging should be removed like
{code}
LOG.info("MAYANK queue NAME "+ qT.leafQueue.getQueueName());
{code}

1.3, This check should be unnecessary
{code}
      if(userLimitforQueue == null){
       LOG.info("MAYANK USER LIMIT IS NULL"); 
      }
{code}

2) preemptFrom
I noticed this method will be called multiple times for a same application within a editSchedule()
call.
The reservedContainers will be calculated multiple times.
An alternative way to do this is to cache
{code}
    List<RMContainer> containers =
      new ArrayList<RMContainer>(app.getLiveContainers());
{code}
At the begining of getContainersToPreempt(), clone them to a set for these running containers.
And remove container from the set if we processed it.
Another benefit to do is eliminate checking like,
{code}
        Set<RMContainer> userContainers = userLimitContainers.get(app
            .getApplicationAttemptId());
        if (userContainers.contains(c)) {
          continue;
        }
{code}
And preemptFrom called by balanceUserLimitsinQueueForPreemption passed dumb skippedAMContainerlist/skippedAMSize
in. It's better to create a overloaded preemptFrom to make code can be easier read.

*In LeafQueue,*
1) I think it's better to remember user limit, no need to compute it every time, add a method
like getUserLimit() to leafQueue should be better.

*In beyond, questions about some undefined requirements,*
1) Should we preempt containers equally from users when there're multiple users beyond user-limit
in a queue?
2) Should we preempt containers equally from applications in a same user? (Heap-like data
structure maybe helpful to solve 1/2)
3) Should user-limit preemption be configurable?

The last is as [~sunilg] suggested,  your patch is based on YARN-2022, it's better to separate
your patch from YARN-2022 for easier review. You can point which patch in YARN-2022 you based
on. 

Thanks,
Wangda

> Add cross-user preemption within CapacityScheduler's leaf-queue
> ---------------------------------------------------------------
>
>                 Key: YARN-2069
>                 URL: https://issues.apache.org/jira/browse/YARN-2069
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Mayank Bansal
>         Attachments: YARN-2069-trunk-1.patch
>
>
> Preemption today only works across queues and moves around resources across queues per
demand and usage. We should also have user-level preemption within a queue, to balance capacity
across users in a predictable manner.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message