hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3020) n similar addContainerRequest()s produce n*(n+1)/2 containers
Date Tue, 20 Jan 2015 21:29:36 GMT

    [ https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284451#comment-14284451 ]

Wangda Tan commented on YARN-3020:

The expected usage of AMRMClient is as follows (thanks to [~hitesh] and [~jianhe] for their
input): when you receive newly allocated containers from the RM, you should manually call
{{removeContainerRequest}} to remove the corresponding pending container requests. AMRMClient
itself does not automatically decrement the number of pending container requests.

The reason is that when a container is allocated by the RM, AMRMClient doesn't know which
ResourceRequest the container was allocated for. You might think that, since a container has a
priority, capacity and resourceName, AMRMClient could find the ResourceRequest via
{{getMatchingRequests}}. But some applications may use the container for another purpose
(AMRMClient cannot understand an application's specific logic). So the AM should call
{{removeContainerRequest}} itself.
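
To make the effect concrete, here is a toy model of the pending count for a single priority. It is not Hadoop code: the class and method names are hypothetical, and it assumes one heartbeat after each add, with the RM granting the full count it is asked for.

```java
/** Toy model only, not Hadoop code. All names here are hypothetical. */
class PendingAskModel {

    /** Returns the total containers granted after n add-then-heartbeat rounds. */
    static int totalAllocated(int n, boolean removeOnAllocation) {
        int pending = 0; // client-side pending count for one priority
        int total = 0;
        for (int i = 0; i < n; i++) {
            pending++;               // like addContainerRequest(): the count is incremented
            int allocated = pending; // heartbeat: the whole pending count is asked for and granted
            total += allocated;
            if (removeOnAllocation) {
                // the AM removes one pending request per allocated container,
                // as it would with removeContainerRequest()
                pending -= allocated;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalAllocated(4, false)); // 1+2+3+4 = 10
        System.out.println(totalAllocated(4, true));  // 1+1+1+1 = 4
    }
}
```

Under these assumptions, skipping the removal reproduces the n*(n+1)/2 total from the report, while removing requests as containers arrive yields n.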

To improve this, I think 1) we need to add this behavior to the YARN documentation, so that
people better understand how to use AMRMClient. And 2) maybe we should add a default
implementation that automatically deducts pending resource requests by the
priority/resource-name/capacity of allocated containers (users can disable this default
behavior and implement their own logic to deduct pending resource requests).

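A rough sketch of what option 2) might look like, as a standalone illustration rather than a patch against AMRMClientImpl. Every name here is hypothetical, capacity is reduced to memory in MB, and matching is by exact equality of priority/resource-name/capacity, which is itself an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch, not the AMRMClient API. */
class PendingRequests {

    /** Hypothetical match key: priority, resource name, and capacity (memory in MB). */
    static final class Key {
        final int priority;
        final String resourceName;
        final int memoryMb;

        Key(int priority, String resourceName, int memoryMb) {
            this.priority = priority;
            this.resourceName = resourceName;
            this.memoryMb = memoryMb;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return priority == k.priority && memoryMb == k.memoryMb
                    && resourceName.equals(k.resourceName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(priority, resourceName, memoryMb);
        }
    }

    private final Map<Key, Integer> pending = new HashMap<>();
    private final boolean autoDeduct; // false = the application deducts on its own

    PendingRequests(boolean autoDeduct) {
        this.autoDeduct = autoDeduct;
    }

    /** Records one more pending request under the given key. */
    void add(Key key) {
        pending.merge(key, 1, Integer::sum);
    }

    /** Called once per allocated container; returns true if a pending request was deducted. */
    boolean onAllocated(Key key) {
        if (!autoDeduct) return false;
        Integer count = pending.get(key);
        if (count == null) return false; // nothing pending matches this container
        if (count == 1) pending.remove(key); else pending.put(key, count - 1);
        return true;
    }

    int pendingCount(Key key) {
        return pending.getOrDefault(key, 0);
    }
}
```

With {{autoDeduct}} set to false, the pending counts are left untouched and the application keeps full control, which matches the current behavior.
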
Does this make sense to you?


> n similar addContainerRequest()s produce n*(n+1)/2 containers
> -------------------------------------------------------------
>                 Key: YARN-3020
>                 URL: https://issues.apache.org/jira/browse/YARN-3020
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
>            Reporter: Peter D Kirchner
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> BUG: If the application master calls addContainerRequest() n times, but with the same
priority, I get up to 1+2+3+...+n containers = n*(n+1)/2 .  The most containers are requested
when the interval between calls to addContainerRequest() exceeds the heartbeat interval of
calls to allocate() (in AMRMClientImpl's run() method).
> If the application master calls addContainerRequest() n times, but with a unique priority
each time, I get n containers (as I intended).
> Analysis:
> There is a logic problem in AMRMClientImpl.java.
> Although AMRMClientImpl.java, allocate() does an ask.clear() , on subsequent calls to
addContainerRequest(), addResourceRequest() finds the previous matching remoteRequest and
increments the container count rather than starting anew, and does an addResourceRequestToAsk()
which defeats the ask.clear().
> From documentation and code comments, it was hard for me to discern the intended behavior
of the API, but the inconsistency reported in this issue suggests one case or the other is
implemented incorrectly.
