www-infrastructure-dev mailing list archives

From Brett Porter <br...@apache.org>
Subject Re: @apache.org commit address requirement (Was: Git hosting is go)
Date Wed, 21 Dec 2011 07:05:38 GMT
[snip a whole lot of very useful information about contributions, CLAs and the license that
I agree with Roy about]

On 21/12/2011, at 2:57 PM, Roy T. Fielding wrote:

> The CLA has requirements on submitting third-party contributions
> that must be adhered to when pushing stuff to Apache.  We expect
> that those requirements will be satisfied by the commit log.
> As it turns out, that is best accomplished by ensuring that the
> original committer and author identifiers remain those of the
> original author and not that of the pusher, even if it is the
> same person (with different IDs on different repos).  If not,
> the pusher needs to change the log in order to add a
> "Submitted-by: ..." note and whatever else needs to be said in
> accordance with the CLA.  This is independent of how the
> contribution is originally submitted to Apache, and it is the
> PMC's responsibility to ensure all its committers do so when
> appropriate.

TL;DR summary: given that, I think the check is still useful (summary & question at the
end if you don't wish to educate me on git)

I am a relative newbie to git, so I may just need someone to explain a few things to me.

In some situations Roy has suggested changing the commit log would be required. My understanding
is that requires something like:

$ git rebase -i

This will also change the "Committer" field, leaving "Author" intact.

Jukka has suggested "Signed-off-by:", which I understand can be done with:

$ git commit --amend -s

This will also change the "Committer" field, leaving "Author" intact, making the change fairly
pointless for our purposes. It also can't easily be done to commits more than one deep in the history.

Unless I'm missing something, any change to the commit logs means Committer is made to be
your own address. Is that right?

If so, we are only talking about examples where the commit log remains 100% intact.
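To check my own understanding of the amend case, here is a minimal sketch run in a throwaway repository. The names and addresses (contrib@example.com, committer@example.org) are made up for illustration, and --no-edit is only there so no editor opens.

```shell
#!/bin/sh
# Sketch: amend someone else's commit and see which identity fields change.
# All names and addresses below are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Original commit: author and committer are both the outside contributor.
echo hello > file.txt
git add file.txt
GIT_AUTHOR_NAME="Outside Contributor" GIT_AUTHOR_EMAIL="contrib@example.com" \
GIT_COMMITTER_NAME="Outside Contributor" GIT_COMMITTER_EMAIL="contrib@example.com" \
  git commit -q -m "original change"

# The ASF committer amends it to add a Signed-off-by line.
GIT_COMMITTER_NAME="ASF Committer" GIT_COMMITTER_EMAIL="committer@example.org" \
  git commit -q --amend -s --no-edit

author=$(git log -1 --format=%ae)
committer=$(git log -1 --format=%ce)
echo "author=$author committer=$committer"
```

If I have this right, the author stays as the contributor while the committer becomes whoever ran the amend, which matches what I described above.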

Working through an example:

A pull request referred to by Jukka: https://github.com/callback/callback-ios/pull/36
Contains a commit: https://github.com/forcedotcom/callback-ios/commit/1c3a5b3d70ede5853f1cae35a6159d48192685cb

commit 1c3a5b3d70ede5853f1cae35a6159d48192685cb
Author:     Todd Stellanova <tstellanova@salesforce.com>
AuthorDate: Fri Nov 11 14:20:52 2011 -0800
Commit:     Todd Stellanova <tstellanova@salesforce.com>
CommitDate: Fri Nov 11 14:20:52 2011 -0800

    refractor PhoneGapDelegate to allow teardown and reinit of web view

My understanding is that doing this is normal:
$ git fetch forcedotcom
$ git merge forcedotcom/master
$ git push origin master

That would leave the commit message above intact.
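A rough sketch of that flow with two local throwaway repositories standing in for the ASF repository and the fork (the paths, names and addresses are made up, and the push is left out since there is no real origin here). It assumes the merge is a fast-forward, which is the case when the ASF side has no new commits of its own.

```shell
#!/bin/sh
# Sketch: a fast-forward merge of a fork leaves the contributed commit untouched.
# Repositories, names and addresses are hypothetical stand-ins.
set -e
work=$(mktemp -d)

# Stand-in for the ASF repository, with one existing commit on master.
git init -q "$work/asf"
cd "$work/asf"
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
git add file.txt
GIT_AUTHOR_NAME="ASF Committer" GIT_AUTHOR_EMAIL="committer@example.org" \
GIT_COMMITTER_NAME="ASF Committer" GIT_COMMITTER_EMAIL="committer@example.org" \
  git commit -q -m "base"

# Stand-in for the outside fork, with one contributed commit on top.
git clone -q "$work/asf" "$work/fork"
cd "$work/fork"
echo change >> file.txt
GIT_AUTHOR_NAME="Outside Contributor" GIT_AUTHOR_EMAIL="contrib@example.com" \
GIT_COMMITTER_NAME="Outside Contributor" GIT_COMMITTER_EMAIL="contrib@example.com" \
  git commit -q -a -m "contributed change"
fork_id=$(git rev-parse HEAD)

# The ASF committer fetches and merges the fork.
cd "$work/asf"
git remote add forcedotcom "$work/fork"
git fetch -q forcedotcom
GIT_COMMITTER_NAME="ASF Committer" GIT_COMMITTER_EMAIL="committer@example.org" \
  git merge -q forcedotcom/master

merged_id=$(git rev-parse HEAD)
merged_committer=$(git log -1 --format=%ce)
echo "id unchanged: $merged_id, committer: $merged_committer"
```

The commit ID and the Commit: field come through exactly as they were in the fork, so nothing in the history records who merged it.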

As a project member, I find that information unhelpful, as we need to go to another source
(apparently from the git server ref-update.log) to find out which ASF committer actually put
it into the tree. The mail notification may indicate that (I haven't checked what we do there),
but it's still detaching some important information from the source control.

Adjusting the committer ID is easy:
$ git rebase --no-ff
(or if just a few need to be handled, git rebase -i)
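Here is a small sketch of what I expect that to do, again in a throwaway repository with made-up names and addresses. I pass HEAD~1 as the upstream only so the example is self-contained; in practice it would be whatever the contributed commits sit on top of.

```shell
#!/bin/sh
# Sketch: git rebase --no-ff replays commits, so the committer is rewritten
# while the author is preserved. Names and addresses are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Existing ASF history.
echo base > file.txt
git add file.txt
GIT_AUTHOR_NAME="ASF Committer" GIT_AUTHOR_EMAIL="committer@example.org" \
GIT_COMMITTER_NAME="ASF Committer" GIT_COMMITTER_EMAIL="committer@example.org" \
  git commit -q -m "base"

# A contributed commit that arrived with the contributor as committer.
echo change >> file.txt
GIT_AUTHOR_NAME="Outside Contributor" GIT_AUTHOR_EMAIL="contrib@example.com" \
GIT_COMMITTER_NAME="Outside Contributor" GIT_COMMITTER_EMAIL="contrib@example.com" \
  git commit -q -a -m "contributed change"
old_id=$(git rev-parse HEAD)

# Replay it as the ASF committer instead of fast-forwarding over it.
GIT_COMMITTER_NAME="ASF Committer" GIT_COMMITTER_EMAIL="committer@example.org" \
  git rebase -q --no-ff HEAD~1

new_id=$(git rev-parse HEAD)
new_author=$(git log -1 --format=%ae)
new_committer=$(git log -1 --format=%ce)
echo "author=$new_author committer=$new_committer"
```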

This does mean all the commit IDs get changed (same as if you have to edit a commit log above,
or you alter the commit stream in some other way). But the downstream user can still work
with that, with either of:

$ git merge origin/trunk  (some extra commits in the local repository)
$ git rebase origin/trunk (catch up with origin again)

There is the case of someone different from the author committing to the other repository:
we would overwrite some information in this case. I can't, however, think of a use case where
we care about that information. The author is important, and the ASF committer is important
- in a trade off, I would take the ASF committer information over the some-other-repo committer.

Summary - I see the committer check as having the following pros:
- ensure commits contain information about which ASF committer was involved (not for provenance,
but for project members to know what is going on with a commit)
- prevent accidentally committing rubbish information (default server email address, work
email address you didn't mean to publish on the web)
And cons:
- inconvenience of rebasing
- losing track of committers to other repositories that are not the author.

I would presume "committer check" means allowing any name & email that they elect to use,
as Paul has described. I think it also means not requiring rebasing of commits from other
ASF committers.
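To make "committer check" concrete, this is roughly the kind of test I have in mind. It is only a sketch and not the actual ASF hook: the check_committers function and the file of allowed addresses are hypothetical, and it assumes the server can look up the addresses committers have elected to use. It also ignores new branches, where the old revision would be all zeros.

```shell
#!/bin/sh
# Sketch of a committer check over a pushed range of commits.
# check_committers and the allowed-address file are hypothetical.
set -e

# check_committers OLD NEW ALLOWED_FILE
# Succeeds only if every committer email in OLD..NEW is listed in ALLOWED_FILE.
check_committers() {
  old=$1 new=$2 allowed=$3
  bad=0
  for email in $(git log --format=%ce "$old..$new" | sort -u); do
    if ! grep -qxF "$email" "$allowed"; then
      echo "rejected: committer <$email> is not a registered address" >&2
      bad=1
    fi
  done
  return $bad
}

# Demo in a throwaway repository with made-up identities.
allowed=$(mktemp)
echo "committer@example.org" > "$allowed"
repo=$(mktemp -d)
cd "$repo"
git init -q .

GIT_AUTHOR_NAME="ASF Committer" GIT_AUTHOR_EMAIL="committer@example.org" \
GIT_COMMITTER_NAME="ASF Committer" GIT_COMMITTER_EMAIL="committer@example.org" \
  git commit -q --allow-empty -m "base"
c0=$(git rev-parse HEAD)

# Outside author, ASF committer: should pass.
GIT_AUTHOR_NAME="Outside Contributor" GIT_AUTHOR_EMAIL="contrib@example.com" \
GIT_COMMITTER_NAME="ASF Committer" GIT_COMMITTER_EMAIL="committer@example.org" \
  git commit -q --allow-empty -m "contribution, rebased by an ASF committer"
c1=$(git rev-parse HEAD)

# Outside author and outside committer: should be rejected.
GIT_AUTHOR_NAME="Outside Contributor" GIT_AUTHOR_EMAIL="contrib@example.com" \
GIT_COMMITTER_NAME="Outside Contributor" GIT_COMMITTER_EMAIL="contrib@example.com" \
  git commit -q --allow-empty -m "contribution, merged as-is"
c2=$(git rev-parse HEAD)

if check_committers "$c0" "$c1" "$allowed"; then first=accepted; else first=rejected; fi
if check_committers "$c1" "$c2" "$allowed" 2>/dev/null; then second=accepted; else second=rejected; fi
echo "first=$first second=$second"
```

Note that the author field is never examined, so contributions from outside authors still go through as long as an ASF committer is recorded as the committer.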

On the weight of that, I would find the check worthwhile. I also see no advantage in replacing
it by a signed-off-by check.

Do others disagree with this conclusion but agree on the data points, or have I missed some
data, pros or cons?


Brett Porter
