hadoop-yarn-issues mailing list archives

From "Peter D Kirchner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3020) n similar addContainerRequest()s produce n*(n+1)/2 containers
Date Fri, 30 Jan 2015 20:28:36 GMT

    [ https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299184#comment-14299184

Peter D Kirchner commented on YARN-3020:

That expected usage you describe, and current implementation contains a basic synchronization
The client application's RPC updates requests to the RM before it receives the containers
newly assigned during that heartbeat.
Therefore, if (as is currently the case) the client calculates the total requests, the total
is too large by at least the number of matching incoming assignments.
Per expected usage and current implementation, both add and remove cause this obsolete, too-high
total to be sent.
Whether by cause or coincidence, I see applications (including but not limited to distributedShell) making
matching requests in a short interval and never calling remove.
They receive the behavior they need, or something closer to it than the expected usage would produce.
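
To make the ordering problem above concrete, here is a minimal sketch in Java. It is a toy model of a
single request key, not the real AMRMClient or RM: the class StaleTotalSketch and every field and method
in it are hypothetical names chosen for illustration, and the scheduler is reduced to a counter. It
assumes the RM replaces its pending count with whatever total the client sends, and that the heartbeat
request is applied before the client sees the containers returned in the response.

```java
/** Toy model of one request key; not the real AMRMClient or RM. */
class StaleTotalSketch {
    int clientTotal;   // client-side total of outstanding requests
    boolean dirty;     // set by add/remove: the total must be resent
    int rmPending;     // RM-side count, replaced by the total the client sends
    int inFlight;      // containers assigned by the RM, not yet delivered

    void add(int k)    { clientTotal += k; dirty = true; }
    void remove(int k) { clientTotal -= k; dirty = true; }

    /** The RM assigns up to k pending containers between heartbeats. */
    void rmSchedules(int k) {
        int n = Math.min(k, rmPending);
        rmPending -= n;
        inFlight += n;
    }

    /** The request is applied first; assignments come back in the response. */
    int heartbeat() {
        if (dirty) {
            rmPending = clientTotal;   // replacement ignores inFlight
            dirty = false;
        }
        int received = inFlight;
        inFlight = 0;
        return received;
    }

    /** Returns { RM pending count, containers the application still needs }. */
    static int[] scenario() {
        StaleTotalSketch s = new StaleTotalSketch();
        s.add(3);
        s.heartbeat();                 // RM now has 3 pending
        s.rmSchedules(2);              // RM: 1 pending, 2 in flight
        s.add(1);                      // client total is now 4
        int received = s.heartbeat();  // RM pending replaced by 4; client gets 2
        int stillNeeded = 4 - received;
        return new int[] { s.rmPending, stillNeeded };
    }
}
```

In this scenario the application wants 4 containers in all and has received 2, so it still needs 2, yet
the RM is left with 4 pending: too high by exactly the number of matching incoming assignments.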

Further, in this API implementation/expected usage the remove API tries to serve two purposes
that are similar but not identical: to update the client-side bookkeeping and to identify
the request data to be sent to the server.  The problem here is that if there are only removes
for allocated containers, then the server-side bookkeeping is correct until the client sends
the total.  The removes called for incoming assigned containers should not be forwarded to
the RM until there is at least one matching add, or a bona-fide removal of a previously add-ed

I suppose the current implementation could be defended because its error:
	1) is "only" too high by the number of matching incoming assignments,
	2) persists "only" for the number of heartbeats it takes to clear the out-of-sync condition,
	3) results in spurious allocations "only" once the application's intentional matching requests
have been granted.
I maintain that spurious allocations are worst-case and especially damaging if obtained by

I want to suggest an alternative that is simpler and accurate, and limited to the AMRMClient
and RM. The fact that the scheduler is updated by replacement informs the choice of where
YARN should calculate that total for a matching request.
The client is in a position to accurately calculate how much its current wants differ from
what it has asked for over its life.
This suggests a fix to the synchronization problem by having the client send the net of add/remove
requests it has accumulated over a heartbeat cycle,
and having the RM update its totals, from the difference obtained from the client, using synchronized
(Note: this client would not ordinarily call remove when it receives a container, as the scheduler
has already properly accounted for it when it made the allocation.)
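
A minimal sketch of this alternative, in Java, under the same toy assumptions (one request key, a counter
in place of the scheduler). DeltaProtocolSketch and all of its members are hypothetical names, not part
of any YARN API, and clamping the RM count at zero is my own assumption. The client sends only the net
of its adds and removes since the last heartbeat, and the RM adds that difference to its own count.

```java
/** Toy model of the proposed net-difference update; not a real YARN API. */
class DeltaProtocolSketch {
    int netDelta;    // adds minus removes accumulated since the last heartbeat
    int rmPending;   // RM-side count, adjusted by the difference the client sends
    int inFlight;    // containers assigned by the RM, not yet delivered

    void add(int k)    { netDelta += k; }
    void remove(int k) { netDelta -= k; }   // only for requests being withdrawn

    /** The RM assigns up to k pending containers between heartbeats. */
    void rmSchedules(int k) {
        int n = Math.min(k, rmPending);
        rmPending -= n;
        inFlight += n;
    }

    /** The RM applies the difference; assignments come back in the response. */
    int heartbeat() {
        rmPending = Math.max(0, rmPending + netDelta);  // clamp is an assumption
        netDelta = 0;
        int received = inFlight;
        inFlight = 0;
        return received;
    }

    /** Returns { RM pending count, containers the application still needs }. */
    static int[] scenario() {
        DeltaProtocolSketch s = new DeltaProtocolSketch();
        s.add(3);
        s.heartbeat();                 // RM now has 3 pending
        s.rmSchedules(2);              // RM: 1 pending, 2 in flight
        s.add(1);                      // net difference for this cycle is +1
        int received = s.heartbeat();  // RM pending becomes 1 + 1 = 2
        int stillNeeded = 4 - received;
        return new int[] { s.rmPending, stillNeeded };
    }
}
```

With the same sequence of events as in the replacement model, the RM ends with 2 pending, which matches
what the application still needs, and the client never has to call remove for a received container.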

> n similar addContainerRequest()s produce n*(n+1)/2 containers
> -------------------------------------------------------------
>                 Key: YARN-3020
>                 URL: https://issues.apache.org/jira/browse/YARN-3020
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
>            Reporter: Peter D Kirchner
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> BUG: If the application master calls addContainerRequest() n times, but with the same
> priority, I get up to 1+2+3+...+n containers = n*(n+1)/2. The most containers are requested
> when the interval between calls to addContainerRequest() exceeds the heartbeat interval of
> calls to allocate() (in AMRMClientImpl's run() method).
> If the application master calls addContainerRequest() n times, but with a unique priority
> each time, I get n containers (as I intended).
> Analysis:
> There is a logic problem in AMRMClientImpl.java.
> Although allocate() in AMRMClientImpl.java does an ask.clear(), on subsequent calls to
> addContainerRequest(), addResourceRequest() finds the previous matching remoteRequest and
> increments the container count rather than starting anew, and does an addResourceRequestToAsk()
> which defeats the ask.clear().
> From documentation and code comments, it was hard for me to discern the intended behavior
> of the API, but the inconsistency reported in this issue suggests one case or the other is
> implemented incorrectly.
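
The arithmetic in the issue description can be reproduced with a small Java simulation. This is a sketch
under stated assumptions, not the AMRMClientImpl code: ReplaceSemanticsSim and its members are
hypothetical, requests are keyed by priority only, the application never calls remove, each add is
followed by one heartbeat, and the RM grants everything it has pending (the worst case, hence "up to").

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of cumulative client totals with RM-side replacement. */
class ReplaceSemanticsSim {
    // priority -> cumulative count; never decremented, since remove is never called
    private final Map<Integer, Integer> clientTable = new HashMap<>();
    // entries to send at the next heartbeat
    private final Map<Integer, Integer> ask = new HashMap<>();
    // RM-side pending count per priority
    private final Map<Integer, Integer> rmPending = new HashMap<>();
    private int granted = 0;

    void addContainerRequest(int priority) {
        // A matching entry is incremented rather than started anew,
        // and the incremented total is queued for sending.
        int total = clientTable.merge(priority, 1, Integer::sum);
        ask.put(priority, total);
    }

    void allocate() {
        rmPending.putAll(ask);   // RM replaces its counts with the client's totals
        ask.clear();             // clearing the ask does not reset clientTable
        for (Map.Entry<Integer, Integer> e : rmPending.entrySet()) {
            granted += e.getValue();   // assume the RM grants everything pending
            e.setValue(0);
        }
    }

    /** Adds n requests, one per heartbeat, and returns the containers granted. */
    static int run(int n, boolean uniquePriorities) {
        ReplaceSemanticsSim sim = new ReplaceSemanticsSim();
        for (int i = 0; i < n; i++) {
            sim.addContainerRequest(uniquePriorities ? i : 0);
            sim.allocate();
        }
        return sim.granted;
    }
}
```

For n = 4, run(4, false) yields 1+2+3+4 = 10 containers with a shared priority, while run(4, true)
yields 4 with unique priorities, matching the two cases described in the issue.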

