hadoop-common-dev mailing list archives

From Sean Busbey <bus...@cloudera.com>
Subject Re: about CHANGES.txt
Date Wed, 18 Mar 2015 20:20:09 GMT
On the matter of handling merges in the history, this comes up over in
Apache Accumulo where development follows a merge-forward model (commits go
oldest first and merge into newer branches). This means that every commit
on an older-but-still-active development branch eventually ends up merged
into the history of newer branches even when the issue was only relevant to
the older branch. The most obvious problem with relying on just the git history
for changes then is that there's no way to programmatically know which of
the commits that show up in the log for a given release tag are relevant to
that release and which ones were only relevant to the older development
line.
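
To make that concrete, here is a minimal Python sketch of the JIRA-driven
alternative discussed elsewhere in this thread. It assumes each commit subject
carries an issue key and that fix versions can be looked up per issue. The
`fix_versions` mapping, the issue keys, and the version numbers below are
invented purely for illustration and do not come from a real JIRA query or a
real repository.

```python
import re

# Matches JIRA-style issue keys such as HADOOP-1234 or ACCUMULO-42.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def issue_keys(subjects):
    """Return issue keys found in commit subjects, in order, without duplicates."""
    seen = []
    for subject in subjects:
        for key in ISSUE_KEY.findall(subject):
            if key not in seen:
                seen.append(key)
    return seen


def relevant_to_release(subjects, fix_versions, release):
    """Keep only issues whose recorded fix versions include `release`.

    `fix_versions` stands in for data pulled from the issue tracker; the git
    log alone cannot supply it.
    """
    return [k for k in issue_keys(subjects) if release in fix_versions.get(k, ())]


# Hypothetical log for a 1.7.0 tag in a merge-forward repository.
log_subjects = [
    "ACCUMULO-100 Add new feature for 1.7",
    "Merge branch '1.6' into master",
    "ACCUMULO-90 Fix build problem specific to 1.6",
    "ACCUMULO-95 Fix bug present in 1.6 and 1.7",
]

# Hypothetical fix versions, as they might be recorded in the issue tracker.
fix_versions = {
    "ACCUMULO-100": {"1.7.0"},
    "ACCUMULO-90": {"1.6.3"},
    "ACCUMULO-95": {"1.6.3", "1.7.0"},
}

print(relevant_to_release(log_subjects, fix_versions, "1.7.0"))
```

All three issues appear in the log for the newer tag, but only the tracker
metadata tells you that the 1.6-only fix does not belong in the 1.7.0 change
list.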

-Sean

On Wed, Mar 18, 2015 at 2:59 PM, Colin P. McCabe <cmccabe@apache.org> wrote:

> Alan, can you forward those private conversations (or some excerpt
> thereof) to the list to explain the problem that you see?
>
> I have been using "git log" to track change history for years and
> never had a problem.  In fact, we don't even maintain CHANGES.txt in
> Cloudera's distribution including Hadoop.  It causes too many spurious
> conflicts during cherry picks so we just discard the CHANGES.txt part
> of the change when backporting things to our branches.  When you are
> backporting hundreds of patches, and each one has a conflict on
> CHANGES.txt (and generally, ALL of them do), it's just not worth it to
> hand-resolve those conflicts.
>
> I also wrote a script to compare which JIRAs were in which branches by
> doing a delta of the git commits.  It works pretty well.  You can even
> visualize merges in git if you want, with tools like gitk (or even
> plain old git log with the right options)
>
> Colin
>
>
> On Tue, Mar 17, 2015 at 11:21 AM, Allen Wittenauer <aw@altiscale.com>
> wrote:
> >
> > Nope.  I’m not particularly in the mood to write a book about a topic
> > that I’ve beat to death in private conversations over the past 6 months
> > other than highlighting that any solution needs to be able to work
> > against scenarios like we had 3 years ago with four active release
> > branches + trunk.
> >
> > On Mar 17, 2015, at 10:56 AM, Yongjun Zhang <yzhang@cloudera.com> wrote:
> >
> >> Thanks Ravi and Colin for the feedback.
> >>
> >> Hi Allen,
> >>
> >> You pointed out that "git log" has problems when dealing with branches
> >> that have merges; would you please elaborate on the problem?
> >>
> >> Thanks.
> >>
> >> --Yongjun
> >>
> >> On Mon, Mar 16, 2015 at 7:08 PM, Colin McCabe <cmccabe@alumni.cmu.edu>
> >> wrote:
> >>
> >>> Branch merges made it hard to access change history on subversion
> >>> sometimes.
> >>>
> >>> You can read the tale of woe here:
> >>>
> >>>
> http://programmers.stackexchange.com/questions/206016/maintaining-svn-history-for-a-file-when-merge-is-done-from-the-dev-branch-to-tru
> >>>
> >>> Excerpt:
> >>> "....prior to Subversion 1.8. The files in the branch and the files in
> >>> trunk are copies and Subversion keeps track with svn log only for
> >>> specific files, not across branches."
> >>>
> >>> I think that's how the custom of CHANGES.txt started, and it was
> >>> cargo-culted forward into the git era despite not serving much purpose
> >>> any more these days (in my opinion.)
> >>>
> >>> best,
> >>> Colin
> >>>
> >>> On Mon, Mar 16, 2015 at 4:49 PM, Ravi Prakash <ravihoo@ymail.com>
> >>> wrote:
> >>>> +1 for automating the information contained in CHANGES.txt. There are
> >>>> some changes which go in without JIRAs sometimes (CVEs, e.g.). I like git
> >>>> log because it's the absolute source of truth (cryptographically secure,
> >>>> audited, distributed, yadadada). We could always use git hooks to force a
> >>>> commit message format.
> >>>>
> >>>> a) Cherry-picks have the same message (by default) as the original.
> >>>> b) I'm not sure why branch mergers would be a problem?
> >>>> c) "Whoops I missed something in the previous commit" wouldn't happen if
> >>>>    our hooks were smartish.
> >>>> d) "no identification of what type of commit it was without hooking into
> >>>>    JIRA anyway." This would be in the format of the commit message.
> >>>>
> >>>> Either way I think would be an improvement.
> >>>> Thanks for your ideas folks
> >>>>
> >>>>
> >>>>
> >>>>     On Monday, March 16, 2015 11:51 AM, Colin P. McCabe <
> >>> cmccabe@apache.org> wrote:
> >>>>
> >>>>
> >>>> +1 for generating CHANGES.txt from JIRA and/or git as part of making a
> >>>> release.  Or just dropping it altogether.  Keeping it under version
> >>>> control creates a lot of false conflicts whenever submitting a patch and
> >>>> generally makes committing minor changes unpleasant.
> >>>>
> >>>> Colin
> >>>>
> >>>> On Sat, Mar 14, 2015 at 8:36 PM, Yongjun Zhang <yzhang@cloudera.com>
> >>> wrote:
> >>>>> Hi Allen,
> >>>>>
> >>>>> Thanks a lot for your input!
> >>>>>
> >>>>> Looks like problems a, c, and d you listed are not too bad, assuming we
> >>>>> can solve d by pulling this info from jira as Sean pointed out.
> >>>>>
> >>>>> Problem b (branch mergers) seems to be a real one, and your approach of
> >>>>> using the JIRA system to build changes.txt is a reasonably good way. This
> >>>>> does count on us updating jira accurately. Since this update is a manual
> >>>>> process, it's possible to have inconsistencies, but that may not be too
> >>>>> bad, since any mistake found here can be remedied by fixing the jira side
> >>>>> and refreshing the result.
> >>>>>
> >>>>> I wonder if we as a community should switch to using your way, and save
> >>>>> committers the effort of taking care of CHANGES.txt (quite some savings,
> >>>>> IMO). Hope more people can share their thoughts.
> >>>>>
> >>>>> Thanks.
> >>>>>
> >>>>> --Yongjun
> >>>>>
> >>>>> On Fri, Mar 13, 2015 at 4:45 PM, Allen Wittenauer <aw@altiscale.com>
> >>> wrote:
> >>>>>
> >>>>>>
> >>>>>> I think the general consensus is don’t include the changes.txt file in
> >>>>>> your commit. It won’t be correct for both branches if such a commit is
> >>>>>> destined for both. (No, the two branches aren’t the same.)
> >>>>>>
> >>>>>> No, git log isn’t more accurate.  The problems are:
> >>>>>>
> >>>>>> a) cherry picks
> >>>>>> b) branch mergers
> >>>>>> c) “whoops i missed something in that previous commit”
> >>>>>> d) no identification of what type of commit it was without hooking into
> >>>>>> JIRA anyway.
> >>>>>>
> >>>>>> This is why I prefer building the change log from JIRA.  We already build
> >>>>>> release notes from JIRA, BTW.  (Not that anyone appears to read them given
> >>>>>> the low quality of our notes…)  Anyway, here’s what I’ve been
> >>>>>> building/using as changes.txt and release notes:
> >>>>>>
> >>>>>> https://github.com/aw-altiscale/hadoop-release-metadata
> >>>>>>
> >>>>>> I try to update these every day. :)
> >>>>>>
> >>>>>> On Mar 13, 2015, at 4:07 PM, Yongjun Zhang <yzhang@cloudera.com>
> >>> wrote:
> >>>>>>
> >>>>>>> Thanks Esteban, I assume this report gets info purely from the jira
> >>>>>>> database, but not "git log" of a branch, right?
> >>>>>>>
> >>>>>>> I hope we get the info from "git log" of a release branch because that'd
> >>>>>>> be more accurate.
> >>>>>>>
> >>>>>>> --Yongjun
> >>>>>>>
> >>>>>>> On Fri, Mar 13, 2015 at 3:11 PM, Esteban Gutierrez <esteban@cloudera.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> JIRA already provides a report:
> >>>>>>>>
> >>>>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12327179&styleName=Html&projectId=12310240
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> cheers,
> >>>>>>>> esteban.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Cloudera, Inc.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Mar 13, 2015 at 3:01 PM, Sean Busbey <busbey@cloudera.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> So long as you include the issue number, you can automate pulling the
> >>>>>>>>> type from jira directly instead of putting it in the message.
> >>>>>>>>>
> >>>>>>>>> On Fri, Mar 13, 2015 at 4:49 PM, Yongjun Zhang <yzhang@cloudera.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I found that changing CHANGES.txt when committing a jira is error
> >>>>>>>>>> prone because of the different sections in the file, and sometimes we
> >>>>>>>>>> forget about changing this file.
> >>>>>>>>>>
> >>>>>>>>>> After all, git log would indicate the history of a branch. I wonder if
> >>>>>>>>>> we could switch to a new method:
> >>>>>>>>>>
> >>>>>>>>>> 1. When committing, ensure the message includes the type of the jira,
> >>>>>>>>>> "New Feature", "Bug Fixes", "Improvement" etc.
> >>>>>>>>>>
> >>>>>>>>>> 2. No longer need to make changes to CHANGES.txt for each commit
> >>>>>>>>>>
> >>>>>>>>>> 3. Before releasing a branch, create the CHANGES.txt by using the "git
> >>>>>>>>>> log" command for the given branch.
> >>>>>>>>>> Thanks.
> >>>>>>>>>>
> >>>>>>>>>> --Yongjun
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Sean
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>
> >
>



-- 
Sean
