hadoop-yarn-issues mailing list archives

From "Shengyang Sha (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-9195) RM Queue's pending container number might get decreased unexpectedly or even become negative once RM failover
Date Mon, 14 Jan 2019 07:41:00 GMT
Shengyang Sha created YARN-9195:

             Summary: RM Queue's pending container number might get decreased unexpectedly
or even become negative once RM failover
                 Key: YARN-9195
                 URL: https://issues.apache.org/jira/browse/YARN-9195
             Project: Hadoop YARN
          Issue Type: Bug
          Components: client
    Affects Versions: 3.1.0
            Reporter: Shengyang Sha
         Attachments: cases_to_recreate_negative_pending_requests_scenario.diff

Hi, all:

We previously encountered a serious problem in the ResourceManager: the pending container
count of one RM queue became negative after the RM failed over. Because RM queues are
organized hierarchically, the root queue's pending container count eventually became negative
as well, which affected scheduling for the whole cluster.
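
The propagation from a leaf queue to the root can be pictured with a toy model. This is
only a sketch: ToyQueue and its methods are hypothetical names, and it assumes each queue
simply adds a pending-count delta to itself and to every ancestor, which is far simpler
than the real scheduler's resource accounting.

```java
/** Toy queue node; a hypothetical stand-in, not the scheduler's queue class. */
class ToyQueue {
    private final String name;
    private final ToyQueue parent; // null for the root queue
    private int pendingContainers;

    ToyQueue(String name, ToyQueue parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Applies a change in pending containers to this queue and to every ancestor. */
    void updatePending(int delta) {
        for (ToyQueue q = this; q != null; q = q.parent) {
            q.pendingContainers += delta;
        }
    }

    int getPendingContainers() {
        return pendingContainers;
    }

    String getName() {
        return name;
    }
}

class ToyQueueDemo {
    public static void main(String[] args) {
        ToyQueue root = new ToyQueue("root", null);
        ToyQueue batch = new ToyQueue("root.batch", root);
        ToyQueue adhoc = new ToyQueue("root.adhoc", root);

        adhoc.updatePending(2);  // a healthy queue with two pending containers
        batch.updatePending(-3); // a bogus negative request lands in one leaf

        System.out.println(batch.getName() + " " + batch.getPendingContainers()); // root.batch -3
        System.out.println(root.getName() + " " + root.getPendingContainers());   // root -1
    }
}
```

In this model a single bad leaf is enough to push the root below zero, even though the
other leaf still has real pending work.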

Both our RM server and the YARN client in our application are based on YARN 3.1, and we use
the AMRMClientAsync#addSchedulingRequests() method in our application to request resources
from the RM.

After investigation, we found that the direct cause was that the numAllocations of some AMs'
requests became negative after the RM failed over. There are at least three necessary conditions:
(1) The YARN client uses schedulingRequests, and the application sets numAllocations to zero
for a schedulingRequest. In our batch job scenario, the numAllocations of a schedulingRequest
can drop to zero because, in theory, a full batch job can run in a single container.
(2) The RM fails over.
(3) Before the AM re-registers with the RM after the RM restarts, the RM has already recovered
some of the application's previously assigned containers.

Here are some more details about the implementation:
(1) After the RM recovers, it sends all live containers to the AM when the AM re-registers,
through RegisterApplicationMasterResponse#getContainersFromPreviousAttempts.
(2) During registerApplicationMaster, AMRMClientImpl calls removeFromOutstandingSchedulingRequests
once the AM gets ContainersFromPreviousAttempts, without checking whether these containers have
already been assigned. As a consequence, its outstanding requests might be decreased unexpectedly
even if they do not become negative.
(3) There is no sanity check in the RM to validate requests from AMs.
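
The double counting in step (2) can be reproduced with a toy counter. This is a sketch and
not the AMRMClientImpl code: ToyOutstandingRequest and its methods are hypothetical names,
and the real client tracks outstanding requests in more detail than a single integer.

```java
import java.util.Arrays;
import java.util.List;

/** Toy AM-side bookkeeping for one scheduling request (hypothetical, not Hadoop code). */
class ToyOutstandingRequest {
    private int numAllocations;

    ToyOutstandingRequest(int numAllocations) {
        this.numAllocations = numAllocations;
    }

    /** Normal path: a newly allocated container satisfies one outstanding allocation. */
    void onContainerAllocated(String containerId) {
        numAllocations--;
    }

    /**
     * Re-registration path as described above: every container reported from previous
     * attempts is subtracted, whether or not it was already counted on the normal path.
     */
    void onContainersFromPreviousAttempts(List<String> containerIds) {
        numAllocations -= containerIds.size();
    }

    int getNumAllocations() {
        return numAllocations;
    }
}

class ToyReregisterDemo {
    public static void main(String[] args) {
        ToyOutstandingRequest request = new ToyOutstandingRequest(1);

        request.onContainerAllocated("container_01"); // 1 -> 0

        // The RM fails over, recovers container_01, and reports it on re-registration.
        request.onContainersFromPreviousAttempts(Arrays.asList("container_01")); // 0 -> -1

        System.out.println(request.getNumAllocations()); // prints -1
    }
}
```

If the counter had started at 2 instead of 1, the same sequence would leave it at 0 rather
than 1: not negative, but still one allocation short of what the application asked for.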

To better illustrate this case, I've written test cases based on the latest Hadoop trunk and
posted them in the attachment. You may try the cases testAMRMClientWithNegativePendingRequestsOnRMRestart
and testAMRMClientOnUnexpectedlyDecreasedPendingRequestsOnRMRestart.

To solve this issue, I propose filtering out already-allocated containers before calling
removeFromOutstandingSchedulingRequests in AMRMClientImpl during registerApplicationMaster.
Some sanity checks are also needed to prevent things from getting worse.
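
One possible shape for this proposal, as a sketch only: the class and method names below
are hypothetical, the set of known container IDs is an assumed way to do the filtering,
and the specific checks (not decrementing below zero on the AM side, rejecting negative
values on the RM side) are assumptions rather than the contents of an actual patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy AM-side bookkeeping that filters containers it has already counted (hypothetical). */
class ToyFilteredRequest {
    private int numAllocations;
    private final Set<String> knownContainers = new HashSet<>();

    ToyFilteredRequest(int numAllocations) {
        this.numAllocations = numAllocations;
    }

    /** Counts a container once, and never lets the counter drop below zero. */
    private void countOnce(String containerId) {
        boolean firstTimeSeen = knownContainers.add(containerId);
        if (firstTimeSeen && numAllocations > 0) {
            numAllocations--;
        }
    }

    void onContainerAllocated(String containerId) {
        countOnce(containerId);
    }

    /** Re-registration path: containers already counted are skipped. */
    void onContainersFromPreviousAttempts(List<String> containerIds) {
        for (String id : containerIds) {
            countOnce(id);
        }
    }

    int getNumAllocations() {
        return numAllocations;
    }
}

/** Toy RM-side sanity check (hypothetical): refuse negative allocation counts. */
class ToyRequestValidator {
    static int validateNumAllocations(int requested) {
        if (requested < 0) {
            throw new IllegalArgumentException("numAllocations must be >= 0, got " + requested);
        }
        return requested;
    }
}

class ToyFilterDemo {
    public static void main(String[] args) {
        ToyFilteredRequest request = new ToyFilteredRequest(2);
        request.onContainerAllocated("container_01"); // 2 -> 1

        // container_01 was already counted; only container_02 is new to the AM.
        request.onContainersFromPreviousAttempts(Arrays.asList("container_01", "container_02"));

        System.out.println(ToyRequestValidator.validateNumAllocations(request.getNumAllocations())); // prints 0
    }
}
```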

More comments and suggestions are welcome.
