www-infrastructure-dev mailing list archives

From Sam Ruby <ru...@intertwingly.net>
Subject Re: Git, history, protection, and other topics
Date Wed, 04 Nov 2015 15:08:34 GMT
On Wed, Nov 4, 2015 at 9:44 AM, David Nalley <david@gnsa.us> wrote:
> On Wed, Nov 4, 2015 at 9:34 AM, Sam Ruby <rubys@intertwingly.net> wrote:
>> On Wed, Nov 4, 2015 at 9:27 AM, David Nalley <david@gnsa.us> wrote:
>>> On Wed, Nov 4, 2015 at 7:43 AM, Sam Ruby <rubys@intertwingly.net> wrote:
>>>> On Tue, Nov 3, 2015 at 11:08 PM, David Nalley <david@gnsa.us> wrote:
>>>>> Hi folks,
>>>>>
>>>>> So earlier today I sent an email to PMCs@ indicating that we had
>>>>> disabled non-fast-forward pushes and branch/tag deletion across
>>>>> all of the ASF git repositories. [1]
>>>>>
>>>>> The crux of the problem is that infrastructure had set the expectation
>>>>> that certain branches and tags were protected from force pushes or
>>>>> branch/tag deletion.
>>>>>
>>>>> It was recently discovered that a large number of our projects were
>>>>> doing their main branch of development outside of these protected
>>>>> branches, and not using the release branch and tag scheme that would
>>>>> leave them protected.  Some were using branches with names like
>>>>> 'develop' while others had $project_foo.
>>>>>
>>>>> As a short-term, interim step to allow us to meet that expectation,
>>>>> we blocked non-fast-forward pushes and branch/tag deletion until
>>>>> we can figure out the best way to adequately address the situation.
>>>>>
>>>>> I don't know whether or not the situation is best addressed via policy
>>>>> or technical means, but the discussion here is designed to discover
>>>>> what that should look like, so that we can move past the admittedly
>>>>> blunt, and likely disruptive measure that we introduced today.
>>>>>
>>>>> So; let the discussions begin.
>>>>
>>>> It would be helpful to start with some goals and/or rationale behind
>>>> the current policy.
>>>>
>>>> I'll start with the assumption that "rewrite history" sounds scary.
>>>> I'm going to make the case the term "rewrite history" isn't accurate.
>>>>
>>>> To start with, a git repository is a set of objects.  We will focus on
>>>> commit objects.  Commits are identified by a hash of both content and
>>>> metadata.  Change anything, and you have a new object with a new hash.
>>>>   Push a new commit and you have both the old object and new object in
>>>> the repository.
>>>
>>> It's true that you have both old and new objects - but in the case of
>>> inadvertent changes that result in orphaned objects, or change the way
>>> 'history looks' (I'll use that term instead of 'rewrite history' since
>>> as you note, it's not entirely accurate) how does one unpick all of
>>> the changes to the way history looks and get that back to an original
>>> state? This is a very real problem I've seen in other projects - a
>>> well meaning developer uses --force and has the ability to disrupt the
>>> view of history, and short of force pushing again (leaving even more
>>> orphaned objects) it's hard to reset the view of the repo to the way it
>>> was.
>>
>> To get something back to the original state, use 'git update-ref'
>> specifying the name of the reference and the previous commit.
>>
>> To do something considerably more complicated, use 'git cherry-pick'
>> where you can specify a set of commits that you want to apply.
>>
>
> (I'm a pessimist, forgive me :) )
> In my experience, when this goes catastrophically wrong this would be
> more difficult; though I think I could see how git-update-ref  could
> work to pull that off - though it's equally scary that git-update-ref
> can so easily change the view of history. I will play with this some
> today and see how well it goes.

Please do!

I agree with "equally scary".  We simply assign different levels to
the value of "scary" :-)

I find it is helpful to look at the underlying operations that occur.

A push adds new objects to a repository and then updates a reference.
In hooks, you can stop the update of a reference before it occurs, or
take actions (such as initiating email) after such an update occurs.
With reflogs, full history is maintained.  You can pass references
like "master@{2}" to git update-ref to undo (possibly multiple)
updates.
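
A rough sketch of the "stop the update before it occurs" case: a
server-side update hook is invoked with the reference name, the old
hash, and the new hash, and a non-zero exit rejects the update.  The
function names and the policy shown here are assumptions for
illustration, not what ASF infrastructure actually runs.

```shell
#!/bin/sh
# Sketch of an `update` hook body. Git invokes the hook as:
#   update <refname> <old-hash> <new-hash>
# and refuses the reference update if the hook exits non-zero.

# Hypothetical helper: succeeds when moving a ref from $1 to $2
# keeps every commit reachable from $1 reachable.
is_fast_forward() {
  old=$1 new=$2
  case $old in
    *[!0]*) ;;          # a real hash: go on to the ancestry check
    *) return 0 ;;      # all zeros: the reference is being created
  esac
  case $new in
    *[!0]*) ;;
    *) return 1 ;;      # all zeros: the reference is being deleted
  esac
  git merge-base --is-ancestor "$old" "$new"
}

# Hypothetical policy: refuse any update that would orphan commits.
check_update() {
  refname=$1 old=$2 new=$3
  if ! is_fast_forward "$old" "$new"; then
    echo "rejected: $refname would no longer reach $old" >&2
    return 1
  fi
}
```

Installed as hooks/update, the script would end by calling
check_update "$@" so that its exit status decides the push.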

Run the script I provided in the previous email[1], then go into the
clone1 directory and type the following commands:

  git log
  git reflog
  git update-ref -m "restore sanity" HEAD HEAD@{1}
  git log
  git reflog
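
The script in [1] lives in a private repository, so here is a minimal
stand-in that can be run anywhere.  It is a sketch under assumptions:
the temporary directory, the file name, the identity, and the use of
"git commit --amend" to simulate the changed history are all chosen
for illustration and are not taken from that script.

```shell
#!/bin/sh
set -e

# Throwaway repository; the path and identity are placeholders.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.invalid"

echo one > file.txt
git add file.txt
git commit -q -m "first"

echo two >> file.txt
git commit -q -am "second"
before=$(git rev-parse HEAD)

# Replace the tip commit. The original commit object stays in the repository.
git commit -q --amend -m "second, reworded"
after=$(git rev-parse HEAD)

git --no-pager reflog     # HEAD@{1} is the commit saved in $before

# Point the branch back at the previous tip, as in the commands above.
git update-ref -m "restore sanity" HEAD "HEAD@{1}"
restored=$(git rev-parse HEAD)

git --no-pager reflog     # the restore is itself recorded as a new entry
```

Note that update-ref only moves the reference; it does not touch the
index or the working tree.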

My belief is that the appropriate response to the ability of trusted
committers to update tags in ways that orphan commits is to notify the
appropriate individuals (generally the relevant PMC, and potentially
the infrastructure team, if that is desired) and to include in that
email the relevant commit hashes which would be used in place of
relative information (like HEAD@{1}).
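
One way such a notification could be assembled, as a sketch only: a
post-receive hook receives one "old-hash new-hash refname" line per
updated reference on standard input, which is enough to report
absolute hashes.  The function name and the message wording are
assumptions for illustration, and mail delivery is left out.

```shell
#!/bin/sh
# Sketch of a post-receive hook body. Git feeds the hook one line per
# updated reference on stdin: "<old-hash> <new-hash> <refname>".

# Hypothetical helper: print a report for every update that dropped commits.
report_rewrites() {
  while read -r old new refname; do
    case $old in
      *[!0]*) ;;
      *) continue ;;      # all zeros: the reference was just created
    esac
    case $new in
      *[!0]*) ;;
      *)                  # all zeros: the reference was deleted
        echo "$refname deleted; it pointed at $old"
        continue ;;
    esac
    if ! git merge-base --is-ancestor "$old" "$new"; then
      echo "$refname moved from $old to $new (not a fast-forward)"
      echo "  to restore: git update-ref -m restore $refname $old"
    fi
  done
}
```

In a real hook, the output of report_rewrites would be handed to
whatever mailer the site uses.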

> --David

- Sam Ruby

[1] https://svn.apache.org/repos/private/foundation/board/github-discussion/explorations/rewrite-history.sh
