www-infrastructure-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: @apache.org commit address requirement (Was: Git hosting is go)
Date Wed, 21 Dec 2011 17:37:05 GMT
On Tue, Dec 20, 2011 at 9:57 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
> On Dec 20, 2011, at 5:46 PM, Paul Davis wrote:
>> On Tue, Dec 20, 2011 at 7:03 PM, David Jencks <david_jencks@yahoo.com> wrote:
>>> On Dec 20, 2011, at 3:48 PM, Paul Davis wrote:
>>>> On Tue, Dec 20, 2011 at 2:22 PM, Jeremy Thomerson
>>>> <jeremy@thomersonfamily.com> wrote:
>>>>> On Tue, Dec 20, 2011 at 3:04 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>>>>>> On Mon, Dec 19, 2011 at 5:15 PM, Jukka Zitting <jukka.zitting@gmail.com>
>>>>>> wrote:
>>> <giant snip>
>>>> Once again I'm going to point out that current patches must move
>>>> through JIRA. Assuming people follow this policy then the %ce field is
>>>> by definition an ASF committer on the Apache project in question. Full
>>>> stop.
>>> What exactly do you mean by "patch moves through jira" for git?
>> I.e., the patch goes to JIRA, the ASF committer then downloads the patch
>> from JIRA, applies, reviews, and then if it checks out, finally pushes
>> it to master (or wherever is appropriate for integration based on
>> that project's workflows). Cassandra has scripts [1] that automate
>> most of this.
> Apache projects are not required to use Jira.  Contributions can
> be contributed using any of our communication forums and they are
> considered to be under the Apache License 2.0.  If the author happens
> to have a CLA on file, then the CLA overrides the normal contribution
> license automatically -- there is no need to check that.
> There is no reason to apply this extra level of control within
> infrastructure for checking things that any reasonably competent
> committer can be trusted to do themselves.  And there is a known
> reason not to do so, namely that the committer field in git has
> nothing to do with the provenance of the code, but may in fact
> vary for the same individual depending on whether they are
> interacting with a public repository or their work's repository,
> or maybe even their club's repository.  Github is certainly one
> example where the committer names will not match our avail names,
> and one of the goals of this effort is to enable folks to
> use Github as one of many forums for collaborating with potential
> recruits.
> Yes, that opinion comes from me speaking as a board member and
> author of the Apache License, and has previously been cleared
> with Apache's legal team for a long ago discussion with Incubator.
> We don't need a CLA on file to accept contributions from non-committers.
> We just need a clear intent by the author to contribute under
> our normal terms.


>>> I think it means that there's a jira issue in apache jira with a pointer
>>> to the (set of) git commits and the commit message in git has the jira #.
>>> On the projects I work with (with svn) the mention of the jira in the
>>> commit message results in jira being able to link to the changes, and we
>>> expect all committers to have a jira issue for all non-trivial changes.
>>> I hope the same will be true for git.  I think a pointer to the git
>>> change set in some repo is equivalent to attaching a patch to a jira
>>> issue.
>> I'm not sure how a pointer to some change set satisfies the policy.
>> The underlying motivation for submitting the patch to JIRA is to
>> indicate "I submit this code to be included under the ASL 2.0" which
>> doesn't seem to hold up if the code isn't actually attached to the
>> ticket with the little check box clicked.
> We have archives on all of our communication channels.  We don't
> need the silly checkbox.  We never have.

The issue I was trying to address with this is that pulling from
GitHub doesn't go through any communication channel that indicates the
contribution was intentional.

>>> I'm really not understanding what the point of doing anything other than
>>> checking that the person who pushes the work into the asf repository is
>>> an asf committer on the project.  That's all we do for svn, right?  We can
>>> do reports or whatever on the additional git metadata but until there's a
>>> demonstrated problem I don't see why we need to solve it.
>>> thanks
>>> david jencks
>> Git is not the same as SVN. This is specifically dealing with the
>> distributed nature of Git and how we can deal with enforcing
>> constraints on code uploaded to ASF canonical repos that may have come
>> from anywhere. As I've tried to point out if we're just going to
>> maintain the same policies we have for SVN then this is all moot
>> because patches would have to be applied by hand instead of being pulled
>> from arbitrary remote repos.
> No, they wouldn't.  What makes you think that I can't implement a
> tool to apply changes found on a forked subversion instance?
> I certainly have the right to do so as an Apache committer.
> That is a common operating procedure for some of our projects
> where we have a commercially-supported fork in house.
> I am trusted to commit to the ASF repository only those changes
> that are intended for contribution to Apache.  Git just includes
> those tools for us and standardizes the process/identifiers.
>> The entire motivation for having these
>> checks is to maintain the provenance for contributions without
>> requiring that every patch moves through JIRA.
> Again, there is no such requirement for commits/pushes at Apache.
> The person responsible for moving the bits into our repository
> is responsible for verifying that they have the right to do so
> before the push is made.  The authors do not need to have a CLA
> on file even if the contribution is massive -- CLAs are only
> required for the people who want an account at Apache and thus
> are allowed to make the decision to push those bits into our
> repository.

This contradicts basically every policy I've ever heard on accepting
contributions as a committer. It's quite possible that I've completely
misunderstood everything I've been told, although everyone I've talked
to (small sample) has expressed the same confusion about how this
doesn't jibe with policy as taught by the incubator.

> The CLA has requirements on submitting third-party contributions
> that must be adhered to when pushing stuff to Apache.  We expect
> that those requirements will be satisfied by the commit log.
> As it turns out, that is best accomplished by ensuring that the
> original committer and author identifiers remain those of the
> original author and not that of the pusher, even if it is the
> same person (with different IDs on different repos).  If not,
> the pusher needs to change the log in order to add a
> "Submitted-by: ..." note and whatever else needs to be said in
> accordance with the CLA.  This is independent of how the
> contribution is originally submitted to Apache, and it is the
> PMC's responsibility to ensure all its committers do so when
> appropriate.
> ....Roy

Just a note, the author fields are about the only thing you can
reasonably guarantee won't change. And even then only if committers
take care to make sure they don't change in some specific
circumstances that anyone not familiar with Git will be prone to no

Also, specifically adding a "Submitted-by:" field (Signed-off-by?)
will change everything that's not the author field. The entire point
of Signed-off-by was to record this information because the SHA and
committer information will change as patches are emailed around. I'd
also strongly caution against forming any policy that revolves around
commit SHAs and committer information not changing.
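
To illustrate why the SHA moves: a commit ID is the SHA-1 of the commit
object, and that object contains the committer line and the full message.
Below is a minimal Python sketch of that hashing, not anything taken from
our tooling. The names, email addresses, timestamps, and parent ID are all
hypothetical; only the object layout and the empty-tree ID are Git's own.

```python
import hashlib


def commit_id(tree, parent, author, committer, message):
    # A commit object is plain text: header lines, a blank line, the message.
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {author}\n"
        f"committer {committer}\n"
        "\n"
        f"{message}"
    ).encode("utf-8")
    # Git hashes "commit <byte length>\0" followed by the body.
    header = f"commit {len(body)}\0".encode("utf-8")
    return hashlib.sha1(header + body).hexdigest()


# All values below are hypothetical, for illustration only.
TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's well-known empty tree
PARENT = "0123456789abcdef0123456789abcdef01234567"
AUTHOR = "Jane Contributor <jane@example.org> 1324400000 +0000"
MESSAGE = "Fix typo in README\n"

# As the contributor committed it in their own repository.
original = commit_id(TREE, PARENT, AUTHOR, AUTHOR, MESSAGE)

# Same patch applied by someone else: only the committer line differs.
applied = commit_id(
    TREE, PARENT, AUTHOR,
    "Some Committer <committer@example.org> 1324486625 +0000",
    MESSAGE,
)

# Same commit with a Signed-off-by trailer appended to the message.
signed = commit_id(
    TREE, PARENT, AUTHOR, AUTHOR,
    MESSAGE + "\nSigned-off-by: Jane Contributor <jane@example.org>\n",
)

print(original)
print(applied)
print(signed)
```

The three printed IDs differ even though the author line is byte-for-byte
identical in all three, which is why the author field is the only part
worth relying on.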

That said, you've made it clear that the board's position is that
these checks are unnecessary so I'll remove them.
