hadoop-yarn-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karthik Kambatla <ka...@cloudera.com>
Subject Re: [VOTE] Merge YARN-1197 container resize into trunk
Date Wed, 23 Sep 2015 22:01:18 GMT
I did skim through the patch before voting :)

As I mentioned earlier, I understand vast majority of the code is net new,
and the likelihood of breaking changes is low. Still, I would like for us
to be more careful and run some end-to-end and regression tests before
including this in a release. While I won't block the merge to branch-2,
will be a -1 on cutting a release without due testing.

At the risk of digressing, I also feel branch-2 should be in a releasable
state more often than not. The only way to bring any cadence and
predictability to our releases is by maintaining branch-2 in such a state.

On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wheeleast@gmail.com> wrote:

> Hi Karthik,
>
> Let me elaborate more to make you feel better of this change, don't be
> scared by the size of the patch :)
>
> Common RM/Scheduler part:
> - AbstractYarnScheduler new logic only.
> - AppSchedulingInfo new logic only.
> - RMContainer / RMNode state machine, new logic only
> - SchedulerApplicationAttempt / Allocation, refactoring to existing
> reservation logic so increase request reservation can reuse it, and
> refactored to simply updating container token / pull container part so
> increase/decrease/new-allocation can reuse same code.
>
> FairScheduler:
> - Small change since we updated how to pull container updated token. I
> believe it will be a straightforward change for you if you take a closer
> look at it.
>
> CapacityScheduler:
> - Most changes are separate logic or small refactorings, most complex
> allocation logic stays within IncreaseContainerAllocator.java.
>
> Please let me know where you want to get more details of implementations.
>
> I strongly suggest you to take a glance at the diff, we have already worked
> on the merge for the past one week, and we've paid a lot of extra time to
> keep YARN-1197 sync with trunk in the past several months. After this merge
> finished, a couple of weeks needed to finish end-to-end test and some other
> extra tests, it won't affect our upcoming branch-2 release.
>
> I would not prefer to merge to trunk only, all people working on RM side
> will be affected, we're very carefully avoid such divergence of RM in
> trunk/branch-2. Since nobody wants to create two different patches for
> every RM changes. And also, after this finished, other efforts can happen
> in parallel such as YARN-4091.
>
> Let me know if you have any other questions/concerns.
>
> Thanks,
> Wangda
>
> On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <kasha@cloudera.com>
> wrote:
>
> > I am sorry, but merging a potentially disruptive change to branch-2
> without
> > end-to-end tests seems too disruptive to me.
> >
> > I do agree with you on the potential inconvenience of having to post
> > different patches for trunk and branch-2, but I would rather have that
> > inconvenience than the risk of merging something that hasn't been
> > thoroughly tested.
> >
> > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wheeleast@gmail.com> wrote:
> >
> > > Hi Karthik,
> > >
> > > Thanks for comments! However, I think only merge to trunk may not work,
> > > this patch involves thousands lines of code changes in scheduler side,
> > only
> > > putting that to trunk could lead to trunk/branch-2 totally incompatible
> > for
> > > resource manager. I think most of the code changes are new to scheduler
> > > instead of modifying existed logic, they're not very tricky to me. And
> > when
> > > 2.8 will be released is not planned yet, at least we have a couple of
> > > months to make sure this feature becomes usable and not cause existing
> > > behavior regressions.
> > >
> > > Sounds good to you?
> > >
> > > Wangda
> > >
> > >
> > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <kasha@cloudera.com>
> > > wrote:
> > >
> > > > +1 on merging to trunk. It would be nice to have some amount of
> testing
> > > > done before the merge, but I understand how merging to trunk would
> > likely
> > > > speed up the testing efforts.
> > > >
> > > > Let us not merge into branch-2 until after we have done a fair bit of
> > > > testing, and are comfortable including it in a release. While the
> code
> > > > mostly appears to not mess with existing scheduling logic, I am
> > concerned
> > > > about regressions to existing scheduling behavior.
> > > >
> > > >
> > > >
> > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
> kasha@cloudera.com>
> > > > wrote:
> > > >
> > > > > By the way, for the purposes of merge vote, I believe a committer's
> > > vote
> > > > > is binding. So, Wangda and Zhihai's votes should be binding. :)
> > > > >
> > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zxu@cloudera.com>
> > wrote:
> > > > >
> > > > >> +1 (non-binding)
> > > > >>
> > > > >> thanks
> > > > >> Zhihai Xu
> > > > >>
> > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
> xgong@hortonworks.com>
> > > > >> wrote:
> > > > >>
> > > > >> > +1 Binding
> > > > >> >
> > > > >> > Thanks
> > > > >> >
> > > > >> > Xuan Gong
> > > > >> >
> > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <jdu@hortonworks.com
> >
> > > > wrote:
> > > > >> > >
> > > > >> > > +1. (Binding).
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > >
> > > > >> > > Junping
> > > > >> > > ________________________________________
> > > > >> > > From: Wangda Tan <wheeleast@gmail.com>
> > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
> > > > >> > > To: yarn-dev@hadoop.apache.org
> > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize
into
> trunk
> > > > >> > >
> > > > >> > > +1 (non-binding),
> > > > >> > >
> > > > >> > > Thanks Jian starting this thread. This can minimize
effort of
> > > works
> > > > >> > across branches.
> > > > >> > >
> > > > >> > > To clarify, this feature is end-to-end code  completed,
we
> have
> > > API,
> > > > >> > rm/nm implementations patches committed, but we haven't
tested
> it
> > > > >> > end-to-end. Filed YARN-4175 to create an example program
to test
> > it
> > > > >> > end-to-end.
> > > > >> > >
> > > > >> > > Regards,
> > > > >> > > Wangda
> > > > >> > >
> > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jhe@hortonworks.com>
> > > wrote:
> > > > >> > >>
> > > > >> > >> Hi All,
> > > > >> > >>
> > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard
work !
> > > > >> > >>
> > > > >> > >> I would like to call a vote to merge YARN-1197
container
> resize
> > > > into
> > > > >> > trunk.
> > > > >> > >>
> > > > >> > >> Key idea:
> > > > >> > >> This feature adds the ability for AM to change
container
> > resource
> > > > >> size
> > > > >> > at runtime.
> > > > >> > >>
> > > > >> > >> Details:
> > > > >> > >> - This feature is tracked at
> > > > >> > https://issues.apache.org/jira/browse/YARN-1197
> > > > >> > >> - It’s currently developed at a separate branch:
> > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
> > > > >> > >> - A uber patch(
> https://issues.apache.org/jira/browse/YARN-4157
> > )
> > > > >> > generated from YARN-1197 to run against trunk  shows all
unit
> > tests
> > > > have
> > > > >> > passed.
> > > > >> > >> - This feature now can work end-to-end.
> > > > >> > >> -  All the unresolved jiras under YARN-1197 will
be the next
> > > step.
> > > > >> > >>
> > > > >> > >> Thanks,
> > > > >> > >> Wangda Tan & Meng Ding & Jian He
> > > > >> > >
> > > > >> > >
> > > > >> >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message