From: Chris Douglas
To: yarn-dev@hadoop.apache.org
Date: Tue, 30 Sep 2014 11:40:59 -0700
Subject: Re: Calling a merge vote for YARN-1051

+1 Excellent work, Carlo and Subru.
-C

On Fri, Sep 26, 2014 at 11:50 AM, Carlo Curino wrote:
> (Apologies if it is delivered twice.)
>
> YARN Devs,
>
> We propose to merge the YARN-1051 development branch into trunk.
>
> Key Idea:
> This work adds support for Reservations to the YARN RM. The key idea is
> to allow users to request dedicated access to resources (a reservation)
> ahead of time. For example, I can ask for "10 containers for 1 hour
> sometime between 4pm and 9pm today". The RM keeps track of the accepted
> reservations by means of a Plan (think of it as an agenda of how the
> cluster resources will be used), and performs admission control to
> guarantee that, if a reservation is accepted, enough resources are set
> aside to satisfy it. We enforce the reservation promises by dynamically
> creating/resizing/removing queues at the right time. This allows us to
> leverage the existing schedulers for the actual container assignment and
> tracking. The key benefit is to expose flexibility of allocation to the
> scheduler, while guaranteeing users predictable resource allocation.
>
> Status
>
> * The work has been "broken down" into 14 subtasks (+3 patches already
>   committed to trunk for move/kill of apps). All the issues have been
>   resolved.
>
> * Jenkins +1'd the patch (with the exception of one test failure which
>   we did not introduce, which is tracked here:
>   https://issues.apache.org/jira/browse/MAPREDUCE-6094)
>
> * Simple integration with MapReduce:
>   https://issues.apache.org/jira/browse/MAPREDUCE-6103
>
> * The broken-down patches have been reviewed and +1ed by Vinod Kumar
>   Vavilapalli, Jian He, Wangda Tan, Karthik Kambatla, and Chris Douglas.
>   Thanks to all of you for the thorough reviews!
>
> * The current version has been rather thoroughly tested by running it
>   on our 250-machine research cluster for months (the first prototype
>   was operational about a year ago):
>
>   o Running hundreds of thousands of jobs generated by a modified
>     version of gridmix that exercises the reservation mechanism side by
>     side with normal queues.
>
>   o Supporting our integration with the resource estimation framework
>     Perforator
>     (http://research.microsoft.com/pubs/178971/perforator.pdf). Kaushik
>     and Dharmesh have been pounding the reservation system for their
>     research for 3-4 months now, and helped us spot a few bugs and iron
>     them out.
>
>   o The code has been inspected/extended by 4-5 other researchers who
>     are exploring integration with other systems and extensions of our
>     algorithms for "reservation placement".
>
> * We have a few ideas for follow-up extensions/improvements, tracked by
>   the umbrella JIRA https://issues.apache.org/jira/browse/YARN-2572
>
> Documents and Deliverables
>
> * This work was accepted for publication at SoCC 2014 (pre-camera-ready
>   version of the paper here):
>   https://issues.apache.org/jira/secure/attachment/12671498/socc14-paper15.pdf
>
> * Shorter design doc:
>   https://issues.apache.org/jira/secure/attachment/12628330/YARN-1051-design.pdf
>
> * Overall patch:
>   https://issues.apache.org/jira/secure/attachment/12671361/YARN-1051.1.patch
>
> * Per Karthik's request we are preparing a small how-to document and
>   example code/configuration, tracked by
>   https://issues.apache.org/jira/browse/YARN-2609
>
> Credits
> Subru and I did much of the coding (hence the flow of patches from us),
> but this is a group effort that would not have been possible without the
> ideas and hard work of many other folks in our research group
> (Microsoft-CISL). Major kudos to: Chris Douglas, Sriram Rao, Raghu
> Ramakrishnan, and our intern Djellel Difallah.
> Also, big thanks to the many folks in the community (Arun, Vinod,
> Alejandro, Bikas, Karthik, Sandy, Hitesh, Jakob, Mohammad, Mayank, Jason,
> Bobby, and many more) who helped us shape our ideas and code with very
> insightful feedback and comments.
>
> We expect the vote to run for the usual 7 days; it will expire at 12pm
> PDT on Oct 3. Please feel free to reach out to us if you have any
> questions/doubts.
>
> Cheers,
> Carlo & Subru
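
For readers new to the reservation idea quoted above, here is a minimal sketch of what "a Plan plus admission control" could look like. It is only an illustration under stated assumptions: the names (Plan, try_reserve), the hourly slots, and the first-fit placement are all hypothetical and are not taken from the YARN-1051 patch, whose placement algorithms are described in the design doc and paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Plan:
    """Hypothetical agenda: containers already promised per hourly slot."""
    capacity: int                                   # total containers in the cluster
    committed: Dict[int, int] = field(default_factory=dict)

    def free(self, slot: int) -> int:
        """Containers not yet promised to any reservation in this slot."""
        return self.capacity - self.committed.get(slot, 0)

    def try_reserve(self, containers: int, duration: int,
                    earliest: int, deadline: int) -> Optional[int]:
        """Admission control (assumed first-fit, not YARN's algorithm).

        Accept the request at the first start slot where every slot of the
        reservation fits and the reservation ends by the deadline; return
        that start slot, or None if the request must be rejected.
        """
        for start in range(earliest, deadline - duration + 1):
            slots = range(start, start + duration)
            if all(self.free(s) >= containers for s in slots):
                for s in slots:
                    self.committed[s] = self.committed.get(s, 0) + containers
                return start
        return None


plan = Plan(capacity=15)
# "10 containers for 1 hour sometime between 4pm and 9pm" (slots are hours).
first = plan.try_reserve(containers=10, duration=1, earliest=16, deadline=21)
second = plan.try_reserve(containers=10, duration=1, earliest=16, deadline=21)
# A 5-hour request can only start at 4pm, which is already too full.
too_big = plan.try_reserve(containers=10, duration=5, earliest=16, deadline=21)
print(first, second, too_big)
```

The point of the sketch is the flexibility mentioned in the proposal: because the user gives a window rather than a fixed start time, the second request can be shifted to a later slot instead of being rejected.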