Date: Thu, 11 Apr 2013 00:15:19 +0000 (UTC)
From: "Carlo Curino (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

[ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628477#comment-13628477 ]

Carlo Curino commented on YARN-45:
----------------------------------

This is still a point we are discussing and it is not fully pinned down, which is why it comes across as confusing and why we were soliciting opinions. Your observations, I think, are helping us frame this a bit better. We see three possible uses of preemption:

1) A preemption policy that does not necessarily trust the AM: it picks the containers itself, lists them as a Set<ContainerId>, and gives the AM a heads-up on which containers are going to be killed soon if they are not released. Note that if the AM is MapReduce this is not too bad, as we know how containers are used (maps before reducers) and so we can pick containers in a reasonable order. We have been testing a policy that does this, and it works well in our tests. It is also a perfect match for how the FairScheduler thinks about preemption.

2) A preemption policy that trusts the AM and specifies preemption as a Set<ResourceRequest>. This works well for known AMs that we trust to honor preemption requests, and/or when we do not care to force-kill anyway and preemption requests are best-effort. We have played around with a version of this too. If I am not mistaken, this is also the case you care the most about, right?

3) A version of 2) that also enforces its preemption requests via killing if they are not satisfied within a certain period of time. This is non-trivial to build, as there is inherent ambiguity in how ResourceRequests are mapped to containers over time, so the enforcement part is hard to get right and to prove correct. We believe 3) might be the ideal design point, but proving its correctness is non-trivial and would require deeper surgery in the RM/schedulers: for example, if at a later moment in time I ask the same AM for the same amount of resources, it is hard to decide unambiguously whether this is because the AM did not preempt as I asked (in which case forcibly killing its containers is fine), or whether these are subsequent, independent requests for resources (in which case I should not kill but wait).
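To make this concrete, here is a minimal sketch of what such a tagged-union message could look like; all class and method names are illustrative placeholders for the sake of discussion, not the types in the attached patch:

{code:java}
import java.util.Collections;
import java.util.Set;

import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

/**
 * Illustrative sketch only: a tagged union carrying either a strict
 * set of containers the RM has already picked (case 1), or a set of
 * resource requests the AM may satisfy as it sees fit (case 2).
 */
public final class PreemptionMessageSketch {

  /** Which arm of the union is populated. */
  public enum Kind { STRICT_CONTAINERS, NEGOTIABLE_RESOURCES }

  private final Kind kind;
  private final Set<ContainerId> containers;      // non-null iff STRICT_CONTAINERS
  private final Set<ResourceRequest> resources;   // non-null iff NEGOTIABLE_RESOURCES

  private PreemptionMessageSketch(Kind kind,
      Set<ContainerId> containers, Set<ResourceRequest> resources) {
    this.kind = kind;
    this.containers = containers;
    this.resources = resources;
  }

  /** Case 1: the RM names the exact containers that will be killed if not released. */
  public static PreemptionMessageSketch strict(Set<ContainerId> toRelease) {
    return new PreemptionMessageSketch(Kind.STRICT_CONTAINERS,
        Collections.unmodifiableSet(toRelease), null);
  }

  /** Case 2: the RM names resources to free; the AM picks which containers to give up. */
  public static PreemptionMessageSketch negotiable(Set<ResourceRequest> toFree) {
    return new PreemptionMessageSketch(Kind.NEGOTIABLE_RESOURCES,
        null, Collections.unmodifiableSet(toFree));
  }

  public Kind getKind() { return kind; }
  public Set<ContainerId> getContainers() { return containers; }
  public Set<ResourceRequest> getResources() { return resources; }
}
{code}

The factory methods keep the two arms mutually exclusive, which is what makes the message easy to explain: an AM inspects the kind once and knows whether the RM has already picked the containers (case 1, kill is imminent if they are not released) or is only naming the resources it wants back (case 2, the AM chooses what to release).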
The proposed protocol, with the change that makes it a tagged union of Set<ContainerId> and Set<ResourceRequest> along the lines sketched above, seems to allow for all of the above and to be easy to explain. I will update the patch to reflect this, if you agree.

> Scheduler feedback to AM to release containers
> ----------------------------------------------
>
>                 Key: YARN-45
>                 URL: https://issues.apache.org/jira/browse/YARN-45
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Chris Douglas
>            Assignee: Carlo Curino
>         Attachments: YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict enforcement of resource invariants in the cluster. Individual allocations of containers must be reclaimed, or reserved, to restore the global invariants when cluster load shifts. In some cases, the ApplicationMaster can respond to fluctuations in resource availability without losing the work already completed by that task (MAPREDUCE-4584). Supplying it with this information would be helpful for overall cluster utilization [1]. To this end, we want to establish a protocol for the RM to ask the AM to release containers.
>
> [1] http://research.yahoo.com/files/yl-2012-003.pdf