hadoop-yarn-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Wangda Tan <wheele...@gmail.com>
Subject Re: [VOTE] Merge YARN-1197 container resize into trunk
Date Fri, 25 Sep 2015 17:23:32 GMT
Hi Karthik,
Will do that next time, thanks!

Regards,
Wangda

On Thu, Sep 24, 2015 at 4:56 PM, Karthik Kambatla <kasha@cloudera.com>
wrote:

> By the way, when we do merge votes, I was under the impression that we do
> two separate votes for trunk and branch-2. While we were discussing merging
> to branch-2 in this thread, the vote was only for trunk. I was hoping to
> exercise my opportunity to -0 on the branch-2 vote :)
>
> In the future, can we please stick to our usual approach of filing a vote
> for each branch please?
>
> On Thu, Sep 24, 2015 at 2:46 PM, Wangda Tan <wheeleast@gmail.com> wrote:
>
> > Thanks Chris for the suggestions,
> > I think we need a couple of weeks to finish end-to-end test, doing this
> is
> > majorly to avoid painful of resolving conflicts in the scheduler side.
> > In any case, will avoid doing this in the future.
> >
> > Regards,
> > Wangda
> >
> >
> > On Thu, Sep 24, 2015 at 2:29 PM, Chris Douglas <cdouglas@apache.org>
> > wrote:
> >
> > > Catching up on this thread. I'd missed Wangda's comment. This was
> > > merged without testing it end-to-end? And backported to branch-2?
> > >
> > > I share Karthik's reservations. This is a very aggressive merging
> > > strategy. Every other feature merged to branch-2 must also help
> > > stabilize YARN-1197. Going through the patch, the particular changes
> > > are not terribly threatening and the passing unit tests are
> > > encouraging. However, a few days' delay to test the feature is not
> > > asking much, particularly when you have reason to be confident in its
> > > stability.
> > >
> > > -0, after the fact. -C
> > >
> > > On Thu, Sep 24, 2015 at 11:39 AM, Wangda Tan <wheeleast@gmail.com>
> > wrote:
> > > > Trunk build seems stable now after the merge. A couple of yarn trunk
> > > builds
> > > > also ran fine such as:
> > > https://builds.apache.org/job/Hadoop-Yarn-trunk/1169/
> > > > .
> > > >
> > > > I've merged this into branch-2 so as to minimize YARN dev effort
> across
> > > > branches as per the discussion thread of YARN-1197 merge.
> > > >
> > > > Thanks,
> > > > Wangda
> > > >
> > > >
> > > > On Wed, Sep 23, 2015 at 3:15 PM, Wangda Tan <wheeleast@gmail.com>
> > wrote:
> > > >
> > > >> Thanks Karthik,
> > > >>
> > > >> Doing end-to-end test as well as regression test is the top priority
> > of
> > > >> this feature. Will keep you updated.
> > > >>
> > > >> Regards,
> > > >> Wangda
> > > >>
> > > >> On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <
> kasha@cloudera.com
> > >
> > > >> wrote:
> > > >>
> > > >>> I did skim through the patch before voting :)
> > > >>>
> > > >>> As I mentioned earlier, I understand vast majority of the code
is
> net
> > > new,
> > > >>> and the likelihood of breaking changes is low. Still, I would
like
> > for
> > > us
> > > >>> to be more careful and run some end-to-end and regression tests
> > before
> > > >>> including this in a release. While I won't block the merge to
> > branch-2,
> > > >>> will be a -1 on cutting a release without due testing.
> > > >>>
> > > >>> At the risk of digressing, I also feel branch-2 should be in a
> > > releasable
> > > >>> state more often than not. The only way to bring any cadence and
> > > >>> predictability to our releases is by maintaining branch-2 in such
a
> > > state.
> > > >>>
> > > >>> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wheeleast@gmail.com>
> > > wrote:
> > > >>>
> > > >>> > Hi Karthik,
> > > >>> >
> > > >>> > Let me elaborate more to make you feel better of this change,
> don't
> > > be
> > > >>> > scared by the size of the patch :)
> > > >>> >
> > > >>> > Common RM/Scheduler part:
> > > >>> > - AbstractYarnScheduler new logic only.
> > > >>> > - AppSchedulingInfo new logic only.
> > > >>> > - RMContainer / RMNode state machine, new logic only
> > > >>> > - SchedulerApplicationAttempt / Allocation, refactoring to
> existing
> > > >>> > reservation logic so increase request reservation can reuse
it,
> and
> > > >>> > refactored to simply updating container token / pull container
> part
> > > so
> > > >>> > increase/decrease/new-allocation can reuse same code.
> > > >>> >
> > > >>> > FairScheduler:
> > > >>> > - Small change since we updated how to pull container updated
> > token.
> > > I
> > > >>> > believe it will be a straightforward change for you if you
take a
> > > closer
> > > >>> > look at it.
> > > >>> >
> > > >>> > CapacityScheduler:
> > > >>> > - Most changes are separate logic or small refactorings,
most
> > complex
> > > >>> > allocation logic stays within IncreaseContainerAllocator.java.
> > > >>> >
> > > >>> > Please let me know where you want to get more details of
> > > >>> implementations.
> > > >>> >
> > > >>> > I strongly suggest you to take a glance at the diff, we have
> > already
> > > >>> worked
> > > >>> > on the merge for the past one week, and we've paid a lot
of extra
> > > time
> > > >>> to
> > > >>> > keep YARN-1197 sync with trunk in the past several months.
After
> > this
> > > >>> merge
> > > >>> > finished, a couple of weeks needed to finish end-to-end test
and
> > some
> > > >>> other
> > > >>> > extra tests, it won't affect our upcoming branch-2 release.
> > > >>> >
> > > >>> > I would not prefer to merge to trunk only, all people working
on
> RM
> > > side
> > > >>> > will be affected, we're very carefully avoid such divergence
of
> RM
> > in
> > > >>> > trunk/branch-2. Since nobody wants to create two different
> patches
> > > for
> > > >>> > every RM changes. And also, after this finished, other efforts
> can
> > > >>> happen
> > > >>> > in parallel such as YARN-4091.
> > > >>> >
> > > >>> > Let me know if you have any other questions/concerns.
> > > >>> >
> > > >>> > Thanks,
> > > >>> > Wangda
> > > >>> >
> > > >>> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <
> > > kasha@cloudera.com>
> > > >>> > wrote:
> > > >>> >
> > > >>> > > I am sorry, but merging a potentially disruptive change
to
> > branch-2
> > > >>> > without
> > > >>> > > end-to-end tests seems too disruptive to me.
> > > >>> > >
> > > >>> > > I do agree with you on the potential inconvenience of
having to
> > > post
> > > >>> > > different patches for trunk and branch-2, but I would
rather
> have
> > > that
> > > >>> > > inconvenience than the risk of merging something that
hasn't
> been
> > > >>> > > thoroughly tested.
> > > >>> > >
> > > >>> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <
> wheeleast@gmail.com
> > >
> > > >>> wrote:
> > > >>> > >
> > > >>> > > > Hi Karthik,
> > > >>> > > >
> > > >>> > > > Thanks for comments! However, I think only merge
to trunk may
> > not
> > > >>> work,
> > > >>> > > > this patch involves thousands lines of code changes
in
> > scheduler
> > > >>> side,
> > > >>> > > only
> > > >>> > > > putting that to trunk could lead to trunk/branch-2
totally
> > > >>> incompatible
> > > >>> > > for
> > > >>> > > > resource manager. I think most of the code changes
are new to
> > > >>> scheduler
> > > >>> > > > instead of modifying existed logic, they're not
very tricky
> to
> > > me.
> > > >>> And
> > > >>> > > when
> > > >>> > > > 2.8 will be released is not planned yet, at least
we have a
> > > couple
> > > >>> of
> > > >>> > > > months to make sure this feature becomes usable
and not cause
> > > >>> existing
> > > >>> > > > behavior regressions.
> > > >>> > > >
> > > >>> > > > Sounds good to you?
> > > >>> > > >
> > > >>> > > > Wangda
> > > >>> > > >
> > > >>> > > >
> > > >>> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla
<
> > > >>> kasha@cloudera.com>
> > > >>> > > > wrote:
> > > >>> > > >
> > > >>> > > > > +1 on merging to trunk. It would be nice to
have some
> amount
> > of
> > > >>> > testing
> > > >>> > > > > done before the merge, but I understand how
merging to
> trunk
> > > would
> > > >>> > > likely
> > > >>> > > > > speed up the testing efforts.
> > > >>> > > > >
> > > >>> > > > > Let us not merge into branch-2 until after
we have done a
> > fair
> > > >>> bit of
> > > >>> > > > > testing, and are comfortable including it
in a release.
> While
> > > the
> > > >>> > code
> > > >>> > > > > mostly appears to not mess with existing scheduling
logic,
> I
> > am
> > > >>> > > concerned
> > > >>> > > > > about regressions to existing scheduling behavior.
> > > >>> > > > >
> > > >>> > > > >
> > > >>> > > > >
> > > >>> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla
<
> > > >>> > kasha@cloudera.com>
> > > >>> > > > > wrote:
> > > >>> > > > >
> > > >>> > > > > > By the way, for the purposes of merge
vote, I believe a
> > > >>> committer's
> > > >>> > > > vote
> > > >>> > > > > > is binding. So, Wangda and Zhihai's votes
should be
> > binding.
> > > :)
> > > >>> > > > > >
> > > >>> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai
Xu <
> > > zxu@cloudera.com>
> > > >>> > > wrote:
> > > >>> > > > > >
> > > >>> > > > > >> +1 (non-binding)
> > > >>> > > > > >>
> > > >>> > > > > >> thanks
> > > >>> > > > > >> Zhihai Xu
> > > >>> > > > > >>
> > > >>> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM,
Xuan Gong <
> > > >>> > xgong@hortonworks.com>
> > > >>> > > > > >> wrote:
> > > >>> > > > > >>
> > > >>> > > > > >> > +1 Binding
> > > >>> > > > > >> >
> > > >>> > > > > >> > Thanks
> > > >>> > > > > >> >
> > > >>> > > > > >> > Xuan Gong
> > > >>> > > > > >> >
> > > >>> > > > > >> > > On Sep 22, 2015, at 12:03
AM, Junping Du <
> > > >>> jdu@hortonworks.com
> > > >>> > >
> > > >>> > > > > wrote:
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > +1. (Binding).
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > Thanks,
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > Junping
> > > >>> > > > > >> > > ________________________________________
> > > >>> > > > > >> > > From: Wangda Tan <wheeleast@gmail.com>
> > > >>> > > > > >> > > Sent: Thursday, September
17, 2015 3:19 AM
> > > >>> > > > > >> > > To: yarn-dev@hadoop.apache.org
> > > >>> > > > > >> > > Subject: Re: [VOTE] Merge
YARN-1197 container resize
> > > into
> > > >>> > trunk
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > +1 (non-binding),
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > Thanks Jian starting this
thread. This can minimize
> > > effort
> > > >>> of
> > > >>> > > > works
> > > >>> > > > > >> > across branches.
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > To clarify, this feature
is end-to-end code
> > completed,
> > > we
> > > >>> > have
> > > >>> > > > API,
> > > >>> > > > > >> > rm/nm implementations patches
committed, but we
> haven't
> > > >>> tested
> > > >>> > it
> > > >>> > > > > >> > end-to-end. Filed YARN-4175
to create an example
> program
> > > to
> > > >>> test
> > > >>> > > it
> > > >>> > > > > >> > end-to-end.
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > Regards,
> > > >>> > > > > >> > > Wangda
> > > >>> > > > > >> > >
> > > >>> > > > > >> > >> On Sep 16, 2015, at
6:30 PM, Jian He <
> > > jhe@hortonworks.com
> > > >>> >
> > > >>> > > > wrote:
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Hi All,
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Thanks Meng Ding and
Wangda Tan for all the hard
> > work !
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> I would like to call
a vote to merge YARN-1197
> > > container
> > > >>> > resize
> > > >>> > > > > into
> > > >>> > > > > >> > trunk.
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Key idea:
> > > >>> > > > > >> > >> This feature adds the
ability for AM to change
> > > container
> > > >>> > > resource
> > > >>> > > > > >> size
> > > >>> > > > > >> > at runtime.
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Details:
> > > >>> > > > > >> > >> - This feature is tracked
at
> > > >>> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197
> > > >>> > > > > >> > >> - It’s currently
developed at a separate branch:
> > > >>> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
> > > >>> > > > > >> > >> - A uber patch(
> > > >>> > https://issues.apache.org/jira/browse/YARN-4157
> > > >>> > > )
> > > >>> > > > > >> > generated from YARN-1197 to
run against trunk  shows
> all
> > > unit
> > > >>> > > tests
> > > >>> > > > > have
> > > >>> > > > > >> > passed.
> > > >>> > > > > >> > >> - This feature now
can work end-to-end.
> > > >>> > > > > >> > >> -  All the unresolved
jiras under YARN-1197 will be
> > the
> > > >>> next
> > > >>> > > > step.
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Thanks,
> > > >>> > > > > >> > >> Wangda Tan & Meng
Ding & Jian He
> > > >>> > > > > >> > >
> > > >>> > > > > >> > >
> > > >>> > > > > >> >
> > > >>> > > > > >> >
> > > >>> > > > > >>
> > > >>> > > > > >
> > > >>> > > > > >
> > > >>> > > > >
> > > >>> > > >
> > > >>> > >
> > > >>> >
> > > >>>
> > > >>
> > > >>
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message