Hello Benson

If I view "Maximum Application Master Resources" on the ResourceManager Web UI for QueueA, I should see 4096MB, correct?
> Yes. Are you seeing any different behavior? If so, please share your capacity-scheduler.xml.

Shouldn't we be able to run 4 uber-mode jobs on QueueA without waiting or using preemption?
> Yes, it should be.

Are you saying that 20% of 5GB is 1GB, so we can only run 1 uber-mode job even though 5GB is available?
> Ideally, no. We take max(queue capacity, available limit) * maximum-am-resource-percent.
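
As a rough sketch of that calculation with the numbers from this thread (20GB cluster, QueueA at 50% capacity and 100% max-capacity, 20% AM percent): the helper name is hypothetical, and reading "available limit" as the queue's max-capacity share of the cluster is an assumption chosen because it reproduces the 4096MB figure discussed here. It is not the scheduler's actual code.

```python
def am_resource_limit_mb(queue_capacity_mb, available_limit_mb, am_percent):
    """Hypothetical helper: max(queue capacity, available limit) * AM percent."""
    # An integer percent keeps the arithmetic exact (no float rounding).
    return max(queue_capacity_mb, available_limit_mb) * am_percent // 100


cluster_mb = 20 * 1024                         # 20GB cluster
queue_a_capacity_mb = cluster_mb * 50 // 100   # 50% capacity -> 10240MB
queue_a_max_mb = cluster_mb * 100 // 100       # 100% max-capacity -> 20480MB (assumed "available limit")

limit = am_resource_limit_mb(queue_a_capacity_mb, queue_a_max_mb, 20)
print(limit)  # 4096
```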

- Sunil


On Tue, Feb 7, 2017 at 11:51 PM Benson Qiu <benson.qiu@salesforce.com> wrote:
Hi Sunil,

Thanks for your reply!

I have some follow-up questions to make sure I fully understand the scenario you mentioned (QueueA has 50% capacity, 100% max-capacity, 20% maximum-am-resource-percent, cluster resource is 20GB, AM container size is 1GB, QueueB has taken over 15GB).

In addition, let's assume the following:
- All jobs run in uber mode so we don't need to worry about additional resources for map and reduce containers.
- root.QueueA and root.QueueB are the only two queues on the cluster.
- user-limit-factor is high enough that a single user can use all of QueueA and QueueB's elasticity.

Some questions:
1. If I view "Maximum Application Master Resources" on the ResourceManager Web UI for QueueA, I should see 4096MB, correct? (QueueA can elastically use 100% of the 20GB cluster. 20% of 20GB = 4096MB).
2. At the current point in time when QueueB is using 15GB, QueueA has 5GB available. Since "Maximum Application Master Resources" is 4096MB, and 5GB is available, shouldn't we be able to run 4 uber-mode jobs on QueueA without waiting or using preemption? Or are you saying that 20% of 5GB is 1GB, so we can only run 1 uber-mode job even though 5GB is available?
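
To make the two readings in question 2 concrete, here is a small sketch of the arithmetic. The function names are hypothetical and neither is claimed to be what the scheduler does; they only spell out the two interpretations being asked about, with a 1GB AM container and 5GB free for QueueA.

```python
AM_CONTAINER_MB = 1024  # AM container size from the scenario
AM_PERCENT = 20         # maximum-am-resource-percent from the scenario


def ams_if_limit_uses_max_capacity(cluster_mb, free_mb):
    """Reading 1: AM limit is 20% of the full 20GB, capped by what is free."""
    am_limit_mb = cluster_mb * AM_PERCENT // 100
    return min(am_limit_mb, free_mb) // AM_CONTAINER_MB


def ams_if_limit_uses_free_space(free_mb):
    """Reading 2: AM limit is 20% of the 5GB currently free."""
    return (free_mb * AM_PERCENT // 100) // AM_CONTAINER_MB


cluster_mb = 20 * 1024
free_mb = 5 * 1024  # QueueB holds 15GB of the 20GB cluster

print(ams_if_limit_uses_max_capacity(cluster_mb, free_mb))  # 4
print(ams_if_limit_uses_free_space(free_mb))                # 1
```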

Thanks,
Benson

On Mon, Feb 6, 2017 at 9:25 PM, Sunil Govind <sunil.govind@gmail.com> wrote:
Hello Benson

I can help explain a little here.

maximum-am-resource-percent can be configured at the per-queue level (from the next release, it can be configured at the per-node-label level as well). The default is 10%, so 10% of the queue's capacity can be used for running AMs. However, due to elasticity, a queue can hold resources above its configured capacity. In that case, "Max Application Master Resources" is calculated from the queue's max limit.

To answer your question: yes, ideally these resources are available for running AMs. However, there could be many other reasons why these resources may not be available for AMs. To list a few, assume QueueA has 50% capacity and 100% max-capacity, the AM resource percentage is 20%, and the cluster resource is 20GB.
- Assume QueueB has taken over 15GB, and one app is running in QueueA with 1GB as its AM resource. As per the calculation, 4GB could go to AM resources. However, we need to wait until some resources are freed from QueueB, or use preemption.
- User limit. If user-limit-factor is <= 1, you may not be able to get more resources through elasticity.

If you tune all the parameters for your scenario, and there are enough resources in the cluster, you can use these resources for AMs.

Thanks
Sunil

On Tue, Feb 7, 2017 at 9:20 AM Benson Qiu <benson.qiu@salesforce.com> wrote:
Hi,

I noticed that "Max Application Master Resources" on the ResourceManager UI (/cluster/scheduler) takes into account queue elasticity.

AMResourceLimit and userAMResourceLimit on the ResourceManager API (/ws/v1/cluster/scheduler) also take into account queue elasticity.
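
For reference, a minimal sketch of pulling those two fields out of the scheduler response once it has been parsed as JSON. The nested layout (`scheduler` -> `schedulerInfo` -> `queues` -> `queue`, with `memory` and `vCores` inside each limit) is my assumption about the response shape, and the sample below is hand-written rather than real ResourceManager output.

```python
def find_queue(node, name):
    """Depth-first search for a queue entry by its queueName."""
    if node.get("queueName") == name:
        return node
    # Parent queues are assumed to nest children under queues -> queue.
    for child in (node.get("queues") or {}).get("queue", []):
        found = find_queue(child, name)
        if found is not None:
            return found
    return None


# Hand-written sample with an assumed layout, NOT real ResourceManager output.
sample = {
    "scheduler": {
        "schedulerInfo": {
            "queueName": "root",
            "queues": {
                "queue": [
                    {
                        "queueName": "QueueA",
                        "AMResourceLimit": {"memory": 4096, "vCores": 1},
                        "userAMResourceLimit": {"memory": 4096, "vCores": 1},
                    },
                    {"queueName": "QueueB"},
                ]
            },
        }
    }
}

queue_a = find_queue(sample["scheduler"]["schedulerInfo"], "QueueA")
print(queue_a["AMResourceLimit"]["memory"])      # 4096 (from the sample above)
print(queue_a["userAMResourceLimit"]["memory"])  # 4096 (from the sample above)
```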

Are these AM resources always guaranteed? If a queue cannot grow because all of the other queues in the cluster are fully utilized, does the queue still have "Max Application Master Resources" available for AM containers?

Thanks,
Benson