mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marco de Abreu <marco.g.ab...@googlemail.com.INVALID>
Subject Re: Squash/Merge PRs
Date Thu, 12 Jul 2018 22:21:44 GMT
Fully agree, if we can get our commit hygiene up to industry standard, I
don't see any problems in using rebase merge instead. For the short term I
agree that it should be fine to rebase merge individual PRs with multiple
contributors. But in my opinion we should then insist on people amending
their commit history if we deem it below our standard. A PR should not be
rebase merged if the history is not proper - verifying that is the
responsibility of the merging committer, but ideally, we'd make people
aware of problems early on. What do you think?

In general, I think we should gradually start taking this into account when
reviewing with the goal of all PRs having a proper commit history. This
would allow us in the long term to completely move away from squashing.

-Marco

Pedro Larroy <pedro.larroy.lists@gmail.com> schrieb am Fr., 13. Juli 2018,
00:07:

> This is a great discussion, thanks for raising the question Naveen.
>
> From my experience working in all kinds of software project of varying
> sizes and flavours:
>
>    1. People should be aware of git rebase interactive and have more
>    hygiene in their PRs. If a PR is hygienic and is separated in a set of
>    semantic commits, it's better not squashed as it helps finding bugs /
>    bisecting. A "dirty" PR with WIP commits, is better squashed, or
> requested
>    to rebase interactively on CR.
>    2. Mixing different semantic changes on a single PR is an anti-pattern,
>    for example fixing whitespace or reformatting many lines and changing
> some
>    code which is hidden in a bunch of trivial changes and could cause a
> bug or
>    major change of behaviour.
>    3. Agreed what with others have mentioned about small incremental steps
>    better than huge PRs. We also have JIRA or issues to relate a set of PRs
>    together. If not possible, PR with a set of non squashed commits is also
>    good.
>
> Hope it helps.
>
> Pedro.
>
> On Thu, Jul 12, 2018 at 11:47 PM Naveen Swamy <mnnaveen@gmail.com> wrote:
>
> > @Aaron, I do not think most contributors(SDE or not) were even aware that
> > their commits are getting squashed into one and thereby others losing
> > credit on that work. I would assume no bad intentions there.
> >
> > @Hao,
> > Agree to breaking down into small and reasonable sized PRs, but I think
> > very small PRs will overwhelm the CI as it stands since it runs
> > everything(this is a separate topic and needs fixing).
> >
> > For cases similar to Aaron's he can raise the PR for the doc
> > work(regardless of fork or not) and add other contributors as co-authors.
> >
> > @Marco, co-author might not be suitable always suitable and agree with
> Hao
> > that should be used on exceptions. co-author also makes it hard to see
> the
> > contributions individually.
> >
> > @Marco, why can we not have Rebase merge enabled on the repo and use that
> > when there are multiple contributors. This discussion is only about Not
> > Squashing when there are multiple contributors and agree to continue the
> > practice of Squashing in general.
> >
> > Until the tooling is fixed, I suggest to use the co-author feature when
> > collaborating on a fork.
> >
> > Also, I just want to reiterate and request contributors to have
> meaningful
> > and fewer commits on PRs.
> >
> > Thanks, Naveen
> >
> >
> > On Thu, Jul 12, 2018 at 11:40 AM, Marco de Abreu <
> > marco.g.abreu@googlemail.com.invalid> wrote:
> >
> > > As of now it will require the usage of a merge bot as far as I
> understand
> > > this issue: https://github.com/python/miss-islington/issues/16.
> Instead
> > of
> > > using the web interface do merge, we'd then trigger the bot to do the
> > merge
> > > on our behalf. There are pre-made solutions on the internet we might be
> > > able to leverage, but it would take some time to get into it and adapt
> > our
> > > process.
> > >
> > > Additionally, GitHub has this feature request tracked internally. Let's
> > see
> > > when they get to add it.
> > >
> > > -Marco
> > >
> > >
> > >
> > > Aaron Markham <aaron.s.markham@gmail.com> schrieb am Do., 12. Juli
> 2018,
> > > 21:33:
> > >
> > > > My point was about collaboration, or the lack thereof, and how the
> > > tooling
> > > > and apparent rewards from contribution statistics can reinforce lone
> > > ranger
> > > > behavior. People can and should be proud of their work, but why does
> it
> > > > have to be alone? Working within the context of a team is something
> to
> > be
> > > > proud of too, but if your stats and standing are graded by how the
> > > commits
> > > > and merges land, and counting lines of code, then incentives of the
> > > system
> > > > are skewed.
> > > > How do we go about implementing the co-author process? It sounds like
> > > > something worth doing!
> > > >
> > > > On Thu, Jul 12, 2018 at 11:15 AM, Hao Jin <hjjn.amzn@gmail.com>
> wrote:
> > > >
> > > > > Hi Aaron,
> > > > > To be fair, this discussion has nothing to do with "pride" of SDEs,
> > but
> > > > > rather a discussion on what is a better software engineering
> practice
> > > for
> > > > > the project. Breaking a large project into smaller tasks is a good
> > > > software
> > > > > engineering practice. This article(https://arxiv.org/pdf/
> > > 1702.01715.pdf)
> > > > > on
> > > > > Google's software engineering practice says: "Engineers are
> > encouraged
> > > to
> > > > > keep each individual change small​, with larger changes preferably
> > > broken
> > > > > into a series of smaller changes that a reviewer can easily review
> in
> > > one
> > > > > go.". If you have concerns or your comments on this practice, we
> can
> > > take
> > > > > the discussion offline. On the other hand, an important spirit of
> > > Apache
> > > > > community is that "one must interact with others, and share vision
> > and
> > > > > knowledge"(https://community.apache.org/contributors/), if you
> > > observed
> > > > > that a majority of the contributors have serious problems with
> their
> > > > > writing maybe you can share some tips and hints on how to write
> > better
> > > > > documentations, in that way not only a lot of us within the
> community
> > > can
> > > > > have some growth but also you can save some time for writing more
> > good
> > > > > documentations and blogposts for MXNet, don't you think so? Or, if
> > you
> > > > only
> > > > > have to fix the doc for someone once in a while, I definitely agree
> > > that
> > > > > you should be given the credit for that, and that's where the
> > co-author
> > > > > field can be useful, which was exactly what I was saying in my
> > previous
> > > > > email. Thanks!
> > > > > Hao
> > > > >
> > > > > On Thu, Jul 12, 2018 at 8:16 AM, Aaron Markham <
> > > > aaron.s.markham@gmail.com>
> > > > > wrote:
> > > > >
> > > > > >  This is a great discussion, and close to my heart, because
I
> want
> > to
> > > > > > include more writers and editors in our community, and I'm
> > struggling
> > > > to
> > > > > > see how to best manage this. It seems like being the sole
> > contributor
> > > > on
> > > > > a
> > > > > > feature is like an engineer's Lone Ranger badge of pride! I
feel
> > like
> > > > it
> > > > > > should be a red flag.
> > > > > >
> > > > > > I work in collaboration with dozens of engineers to improve
their
> > > > > > documentation, explain their features, change flows to improve
> user
> > > > > > experience, and sometimes fix bugs and write code. When the
PR's
> > docs
> > > > has
> > > > > > great coverage, is clear, and not loaded with bad grammar and
> > > spelling
> > > > > > mistakes, I will put comments in a review. But sometimes there
> > needs
> > > to
> > > > > be
> > > > > > a complete rework such that hundreds of review comments isn't
> > > feasible,
> > > > > and
> > > > > > they can't be properly addressed. In a lot of these cases, I
> commit
> > > my
> > > > > > changes to someone else's fork. Now I know the people I work
with
> > on
> > > > > their
> > > > > > PRs appreciate my help, but when all of this work gets squashed
> > down
> > > to
> > > > > one
> > > > > > commit, I'm pretty much regularly getting squashed out. I'm
not
> > sure
> > > > who
> > > > > > else is in this boat, but does the community recognize our
> > > > contributions
> > > > > > when this is the general mode of operation?
> > > > > >
> > > > > > Regarding co-author - I'm not sure how people would feel about
me
> > > > being a
> > > > > > co-author on their work. Documentation and clarity in your work
> > > product
> > > > > is
> > > > > > important, but people's views on their personal *code*
> contribution
> > > > seems
> > > > > > to be more important than the overall code & feature quality
> itself
> > > > when
> > > > > > docs are part of the package. I feel that if I do follow-on
PRs
> > > instead
> > > > > of
> > > > > > fixing things before they are merged, that we would be releasing
> > > > > incorrect,
> > > > > > incomplete, and inferior product. But, in absence of a better
> > > solution,
> > > > > > maybe that's the pill we have to swallow.
> > > > > >
> > > > > > -1 on lots of small PRs (until we expand our range of committers
> > and
> > > > > reduce
> > > > > > the latency in reviews and merges).
> > > > > > +1 on improving the quality of commit messages, so we don't
have
> to
> > > > > squash
> > > > > > & merge all of the time.
> > > > > > +1 on improving the practice of better commit & merge management
> so
> > > > that
> > > > > > the commit history is concise and meaningful. (I'm guilty of
> this,
> > > and
> > > > > > certainly need to improve here.)
> > > > > > +1 on co-author - assuming that's what most everyone thinks
is a
> > good
> > > > > > solution for now. I'm unclear on how this gets managed though.
> > > > > >
> > > > > > Cheers,
> > > > > > Aaron
> > > > > >
> > > > > >
> > > > > > On Thu, Jul 12, 2018 at 5:19 AM, Anton Chernov <
> > mechernov@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Unfortunately there has been significant push back for
small
> > > > iterative
> > > > > > PR's
> > > > > > > and as a result they have grown substantially involving
> multiple
> > > > > > > contributors, multiple commits, sometimes multiple branches
to
> be
> > > > > worked
> > > > > > > on.
> > > > > > >
> > > > > > > I fully agree and support all points that Jin raised:
> > > > > > >
> > > > > > > 1) Most contributions should be broken down into smallest
> > possible
> > > > > PR's.
> > > > > > > 2) If a PR is small enough a single person can complete
it.
> > > > > > > 3) If a majority of PR is done by someone, and there's
some
> minor
> > > > issue
> > > > > > > he/she needs help with it could be done in a follow up
PR by
> > > anybody,
> > > > > > > including the reviewer.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Anton
> > > > > > >
> > > > > > > чт, 12 июл. 2018 г. в 10:11, Hao Jin <hjjn.amzn@gmail.com>:
> > > > > > >
> > > > > > > > +1 for Marco's proposal on using the co-author field.
IMHO
> the
> > > > usage
> > > > > of
> > > > > > > > co-author field should not be that often, consider:
> > > > > > > > 1) If a PR is so big that it needs multiple people
to
> > contribute
> > > a
> > > > > > > > substantial part of it, why can't that PR be broken
down into
> > > > smaller
> > > > > > PRs
> > > > > > > > each submitted by single one of them?
> > > > > > > > 2) If a PR is small enough (for example, < 300
lines), why
> > can't
> > > a
> > > > > > single
> > > > > > > > person complete it?
> > > > > > > > 3) If a majority of PR is done by someone, and there's
some
> > minor
> > > > > issue
> > > > > > > > he/she needs help with(a small bug, incomplete doc),
why
> can't
> > > that
> > > > > be
> > > > > > > done
> > > > > > > > through code reviews?
> > > > > > > > From the above 3 situations we can see that collaborations
on
> > > > > > individual
> > > > > > > > PRs should not be quite often, but I do agree that
it could
> > still
> > > > be
> > > > > > > > necessary when someone lacks the related expertise/knowledge
> on
> > > > some
> > > > > > > > necessary part of that PR given that the required
knowledge
> > > cannot
> > > > be
> > > > > > > > picked up in a short period of time.
> > > > > > > >
> > > > > > > > I do agree that keeping the commit histories of PRs
clean is
> > very
> > > > > > > important
> > > > > > > > as it could be confusing when reviewing PRs, but that
really
> > > > depends
> > > > > on
> > > > > > > > personal preferences (For my case, I usually do "git
commit
> > > > --amend"
> > > > > on
> > > > > > > > trivial changes and get a new commit for bigger changes
such
> > as a
> > > > > > > > checkpoint of my whole PR). With growing the community
and
> > > > attracting
> > > > > > > more
> > > > > > > > contributors as a high priority for our project, I
would
> prefer
> > > > that
> > > > > we
> > > > > > > do
> > > > > > > > not put even more burden on the contributors by asking
them
> to
> > > > manage
> > > > > > and
> > > > > > > > squash the commits themselves just for the not-so-often
cases
> > of
> > > > > having
> > > > > > > > multiple contributors on a single PR.
> > > > > > > > Best regards,
> > > > > > > > Hao
> > > > > > > >
> > > > > > > > On Thu, Jul 12, 2018 at 12:35 AM, Marco de Abreu <
> > > > > > > > marco.g.abreu@googlemail.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > Hi Naveen,
> > > > > > > > >
> > > > > > > > > I'm in favour of the squashing, considering the
number of
> > > commits
> > > > > in
> > > > > > > some
> > > > > > > > > PRs and especially because of some people making
commit
> > > messages
> > > > a
> > > > > la
> > > > > > > > "fix"
> > > > > > > > > "fix" "fix" all the time. Additionally, it gets
hard (not
> > > > > impossible,
> > > > > > > > just
> > > > > > > > > more inconvenient) to determine the atomic states
of
> master -
> > > > aka,
> > > > > > > which
> > > > > > > > > commits are separate from each other. You should
consider
> > that
> > > > > > > > intermediary
> > > > > > > > > commits are unstable (fail CI) and thus it could
be very
> hard
> > > to
> > > > > > bisect
> > > > > > > > > failures in future - and the commit history gets
cluttered.
> > > > > > > > >
> > > > > > > > > As alternative, I'd like to suggest the co-author
field for
> > > these
> > > > > > > cases.
> > > > > > > > > Further documentation is available at
> > > > > > > > >
> https://blog.github.com/2018-01-29-commit-together-with-co-
> > > > > authors/.
> > > > > > > > >
> > > > > > > > > I definitely agree with the second part. We should
all lead
> > by
> > > > > > example
> > > > > > > > and
> > > > > > > > > maintain a high quality by keeping our commit
messages
> clean
> > > and
> > > > > > > > > meaningful. When I receive an email notification
that a new
> > > > commit
> > > > > > has
> > > > > > > > been
> > > > > > > > > added and it only contains "fix" as title, it's
not that
> > > helpful
> > > > > and
> > > > > > > also
> > > > > > > > > it's hard to track the development of a PR overtime.
E.g.,
> > why
> > > > has
> > > > > > > > > something been changed? Was there maybe a bug
that we
> didn't
> > > > cover
> > > > > > with
> > > > > > > > > tests but the author just hacked something to
get it to
> work
> > > but
> > > > > the
> > > > > > > > > problem still lays somewhere? We won't know that
way and it
> > > makes
> > > > > it
> > > > > > > > harder
> > > > > > > > > for us to review.
> > > > > > > > >
> > > > > > > > > Best regards,
> > > > > > > > > Marco
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Naveen Swamy <mnnaveen@gmail.com> schrieb
am Do., 12. Juli
> > > 2018,
> > > > > > > 10:09:
> > > > > > > > >
> > > > > > > > > > Hi All,
> > > > > > > > > >
> > > > > > > > > > I am seeing that maintainers merge PRs into
the repo,
> they
> > > are
> > > > > > > > squashing
> > > > > > > > > > the commits in the PR, which I understand
and agree is to
> > > keep
> > > > a
> > > > > > sane
> > > > > > > > > > commit history, however this is causing
problem when
> there
> > > are
> > > > > > > multiple
> > > > > > > > > > contributors involved on a PR(by contributing
to a fork
> of
> > > the
> > > > > > repo)
> > > > > > > > this
> > > > > > > > > > effectively removes credit for multiple
contributors
> > involved
> > > > and
> > > > > > > shows
> > > > > > > > > all
> > > > > > > > > > code as authored by the contributor who
created the PR.
> > > > > > > > > >
> > > > > > > > > > Can I request maintainers to not squash
PRs if there are
> > > > multiple
> > > > > > > > > > contributors involved on the PR.
> > > > > > > > > >
> > > > > > > > > > Also on the same note, I request contributors(regardless
> of
> > > > > > multiple
> > > > > > > > > > contributors or not) to keep a clean commit
history by
> > > > squashing
> > > > > > the
> > > > > > > > > > commits and not push all your WIP commits
to the PR. this
> > > will
> > > > > help
> > > > > > > us
> > > > > > > > > keep
> > > > > > > > > > our commit history clean and meaningful.
> > > > > > > > > >
> > > > > > > > > > Let me know your thoughts/better approach
or If I have
> > > > > > misunderstood
> > > > > > > > how
> > > > > > > > > > this works.
> > > > > > > > > >
> > > > > > > > > > Thanks, Naveen
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message