hadoop-yarn-dev mailing list archives

From Karthik Kambatla <ka...@cloudera.com>
Subject Re: Calling a merge vote for YARN-1051
Date Thu, 02 Oct 2014 23:17:59 GMT
If this vote is meant for all branches:

+1 to merge to trunk
+1 to merge to branch-2
+1 to merge to branch-2.6, provided we "label" this feature
experimental/alpha until the follow-up items are addressed.
-0 to unconditional merge to branch-2.6.

PS: We should decide how to communicate the stability of a feature. Maybe the
new-feature notes in the release documentation should carry this label?



On Wed, Oct 1, 2014 at 6:23 PM, Karthik Kambatla <kasha@cloudera.com> wrote:

> +1. Nicely done, Subru and Carlo.
>
> I have been partially involved with the work, and have reviewed some of
> the patches. With some help from Subru and documentation from Carlo
> (thanks!), I was able to play with the reservation system. Verified the
> following:
> 1. Reservations can be made only for the amount of resources available for
> that queue.
> 2. Jobs submitted against a reservation run in the corresponding
> "reservation" queue, and jobs submitted to the same higher-level queue but
> not against a reservation run in the corresponding "default" queue.
> 3. The web-ui shows the reserved resources in a queue even when there are
> no apps running.
>
> There are a few follow-up items towards feature completeness, and I am
> okay with working on them post merge to trunk as planned.
> 1. Support for FairScheduler
> 2. Recover reservations on RM restart/failover
> 3. CLI and/or REST APIs to make reservations - this is very useful for
> testing
> 4. Documentation in the usual apt.vm format.
>
> Cheers!
> Karthik
>
>
>
>
> On Wed, Oct 1, 2014 at 1:29 PM, Wangda Tan <wheeleast@gmail.com> wrote:
>
>> +1 (non-binding),
>> Reviewed several patches related to scheduler side changes. As Jian
>> mentioned, this will not affect existing behavior.
>> Looking forward to this feature being used by more people. Thanks to Carlo
>> and Subru!
>>
>> Thanks,
>> Wangda
>>
>> On Wed, Oct 1, 2014 at 1:21 PM, Jian He <jhe@hortonworks.com> wrote:
>>
>> > +1,
>> >
>> > Carlo and Subru, great job! Thanks for your contribution!
>> > I reviewed a couple of CapacityScheduler-related patches; they are in good
>> > shape. At a minimum, they do not affect existing behavior, so they should
>> > be safe to merge.
>> >
>> > Jian
>> >
>> >
>> > On Wed, Oct 1, 2014 at 2:46 AM, Thomas Jungblut <tjungblut@apache.org>
>> > wrote:
>> >
>> > > +1 (non-binding)
>> > > Thanks for adding this, really useful feature.
>> > >
>> > > On 30 September 2014 19:40, Chris Douglas <cdouglas@apache.org> wrote:
>> > >
>> > > > +1
>> > > >
>> > > > Excellent work, Carlo and Subru. -C
>> > > >
>> > > > On Fri, Sep 26, 2014 at 11:50 AM, Carlo Curino <ccurino@microsoft.com>
>> > > > wrote:
>> > > > > (Apologies if it is delivered twice.)
>> > > > >
>> > > > > YARN Devs,
>> > > > >
>> > > > > We propose to merge YARN-1051 development branch into trunk.
>> > > > >
>> > > > > Key Idea:
>> > > > > This work adds support for Reservations to the YARN RM. The key idea
>> > > > > is to allow users to request dedicated access to resources (a
>> > > > > reservation) ahead of time. For example, I can ask for "10 containers
>> > > > > for 1 hour sometime between 4pm and 9pm today". The RM keeps track of
>> > > > > the accepted reservations by means of a Plan (think of it as an agenda
>> > > > > of how the cluster resources will be used), and performs admission
>> > > > > control to guarantee that, if a reservation is accepted, enough
>> > > > > resources are set aside to satisfy it. We enforce the reservation
>> > > > > promises by dynamically creating/resizing/removing queues at the right
>> > > > > time. This allows us to leverage the existing schedulers for the
>> > > > > actual container assignment and tracking. The key benefit is to expose
>> > > > > flexibility of allocation to the scheduler, while guaranteeing users
>> > > > > predictable resource allocation.
>> > > > >
>> > > > > Status
>> > > > >
>> > > > > * The work has been "broken down" into 14 subtasks (+3 patches already
>> > > > >   committed to trunk for move/kill of apps). All the issues have been
>> > > > >   resolved.
>> > > > >
>> > > > > * Jenkins +1ed the patch (with the exception of one test failure which
>> > > > >   we did not introduce, and which is tracked here:
>> > > > >   https://issues.apache.org/jira/browse/MAPREDUCE-6094)
>> > > > >
>> > > > > * Simple integration with MapReduce:
>> > > > >   https://issues.apache.org/jira/browse/MAPREDUCE-6103
>> > > > >
>> > > > > * The broken-down patches have been reviewed and +1ed by Vinod Kumar
>> > > > >   Vavilapalli, Jian He, Wangda Tan, Karthik Kambatla, and Chris
>> > > > >   Douglas. Thanks to all of you for the thorough reviews!
>> > > > >
>> > > > > * The current version has been rather thoroughly tested by running it
>> > > > >   on our 250-machine research cluster for months (the first prototype
>> > > > >   was operational about a year ago):
>> > > > >
>> > > > >   o We ran hundreds of thousands of jobs generated by a modified
>> > > > >     version of gridmix that exercises the reservation mechanism side
>> > > > >     by side with normal queues.
>> > > > >
>> > > > >   o We used it to support our integration with the resource estimation
>> > > > >     framework Perforator
>> > > > >     (http://research.microsoft.com/pubs/178971/perforator.pdf).
>> > > > >     Kaushik and Dharmesh have been pounding the reservation system for
>> > > > >     their research for 3-4 months now, and helped us spot a few bugs
>> > > > >     and iron them out.
>> > > > >
>> > > > >   o The code has been inspected/extended by 4-5 other researchers who
>> > > > >     are exploring integration with other systems and extensions of our
>> > > > >     algorithms for "reservation placement".
>> > > > >
>> > > > > * We have a few ideas for follow-up extensions/improvements, which are
>> > > > >   tracked by the umbrella JIRA
>> > > > >   https://issues.apache.org/jira/browse/YARN-2572
>> > > > >
>> > > > > Documents and Deliverables
>> > > > >
>> > > > > * This work was accepted for publication to SoCC 2014 (pre-camera
>> > > > >   ready version of the paper here):
>> > > > >   https://issues.apache.org/jira/secure/attachment/12671498/socc14-paper15.pdf
>> > > > >
>> > > > > * Shorter design doc:
>> > > > >   https://issues.apache.org/jira/secure/attachment/12628330/YARN-1051-design.pdf
>> > > > >
>> > > > > * Overall patch:
>> > > > >   https://issues.apache.org/jira/secure/attachment/12671361/YARN-1051.1.patch
>> > > > >
>> > > > > * Per Karthik's request we are preparing a small how-to document and
>> > > > >   example code/configuration tracked by
>> > > > >   https://issues.apache.org/jira/browse/YARN-2609
>> > > > >
>> > > > >
>> > > > > Credits
>> > > > > Subru and I did much of the coding (hence the flow of patches from
>> > > > > us), but this is a group effort that would not have been possible
>> > > > > without the ideas and hard work of many other folks in our research
>> > > > > group (Microsoft-CISL). Major kudos to: Chris Douglas, Sriram Rao,
>> > > > > Raghu Ramakrishnan, and our intern Djellel Difallah. Also big thanks
>> > > > > to the many folks in the community (Arun, Vinod, Alejandro, Bikas,
>> > > > > Karthik, Sandy, Hitesh, Jakob, Mohammad, Mayank, Jason, Bobby, and
>> > > > > many more) who helped us shape our ideas and code with very insightful
>> > > > > feedback and comments.
>> > > > >
>> > > > > We expect the vote to run for the usual 7 days and to expire at 12pm
>> > > > > PDT on Oct 3. Please feel free to reach out to us if you have any
>> > > > > questions/doubts.
>> > > > >
>> > > > > Cheers,
>> > > > > Carlo & Subru
>> > > > >
>> > > >
>> > >
>> >
>> >
>>
>
>
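
The "Key Idea" in Carlo's proposal quoted above (a Plan that records accepted reservations, plus admission control that only accepts what the Plan can still satisfy) can be illustrated with a small sketch. This is not the YARN-1051 code or its API: the `Plan` class, the `try_reserve` method, the hourly slots, and the first-fit placement are all assumptions made for illustration, and the real reservation placement algorithms are described in the design doc and paper linked in the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Plan:
    """Hypothetical agenda: containers already promised in each hourly slot."""
    capacity: int                                   # containers available to the plan
    committed: Dict[int, int] = field(default_factory=dict)  # hour -> reserved containers

    def free(self, hour: int) -> int:
        # Containers not yet promised to any reservation in this hour.
        return self.capacity - self.committed.get(hour, 0)

    def try_reserve(self, containers: int, duration: int,
                    earliest: int, deadline: int) -> Optional[int]:
        """Admission control: return the chosen start hour, or None if rejected.

        Tries each start hour in [earliest, deadline - duration] and accepts
        the first window in which every slot has enough free containers
        (first-fit is an assumption of this sketch).
        """
        for start in range(earliest, deadline - duration + 1):
            window = range(start, start + duration)
            if all(self.free(h) >= containers for h in window):
                # Accepting the reservation sets the resources aside in the plan.
                for h in window:
                    self.committed[h] = self.committed.get(h, 0) + containers
                return start
        return None


# "10 containers for 1 hour sometime between 4pm and 9pm today"
plan = Plan(capacity=15)
first = plan.try_reserve(containers=10, duration=1, earliest=16, deadline=21)
# Same size, but it must fit between 4pm and 5pm, where only 5 containers remain.
second = plan.try_reserve(containers=10, duration=1, earliest=16, deadline=17)
# A flexible request can be shifted to a later slot instead of being rejected.
third = plan.try_reserve(containers=10, duration=1, earliest=16, deadline=21)
```

The third request shows the flexibility the proposal mentions: because the user only asked for "sometime between 4pm and 9pm", the plan is free to place it in a later hour when the earliest one is full.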
