hadoop-general mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: HADOOP-7106: Re-organize hadoop subversion layout
Date Wed, 20 Apr 2011 05:58:44 GMT
On Tue, Apr 19, 2011 at 10:20 PM, Todd Lipcon <todd@cloudera.com> wrote:

>
> I'm currently looking into how the git mirrors are set up in Apache-land.
>

Git-wise, I think we have two options:

Option 1)
- Create a new git mirror for the new hadoop/ tree. This will have no
history.
- On the Apache side, fetch the split-project git mirrors into the combined
git mirror as branches - e.g. hadoop-hdfs.git:trunk becomes a branch named
something like pre-HADOOP-7106/hdfs/trunk. Thus, when any user fetches,
they'll get all the git objects from "prehistory" as well without having to
add separate remotes.
- Add a script or README file explaining how to set up git grafts on the
combined hadoop.git so that the new combination branch "foo" looks like a
merge of pre-HADOOP-7106/{hdfs,common,mapred}/foo. Since git grafts are
local constructs, each git user would have to run this script once after
checking out the git tree, after which the history would be "healed".
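
To make the graft step concrete, here is a rough sketch of what such a
script might do. It relies only on the graft file format (one line per
commit in .git/info/grafts: the commit's SHA1 followed by the SHA1s of the
parents it should appear to have). The function names and the placeholder
SHA1s are hypothetical; a real script would presumably look up the actual
commit ids with something like `git rev-parse`.

```python
from pathlib import Path


def graft_line(commit, parents):
    """Format one grafts entry: the commit, then its pretended parents."""
    return " ".join([commit, *parents])


def install_grafts(repo_dir, grafts):
    """Append graft entries to <repo_dir>/.git/info/grafts.

    `grafts` is a list of (commit_sha, [parent_sha, ...]) pairs.
    Entries already in the file are skipped, so running this twice is
    harmless. Returns the lines that were actually added.
    """
    grafts_file = Path(repo_dir) / ".git" / "info" / "grafts"
    grafts_file.parent.mkdir(parents=True, exist_ok=True)

    seen = set()
    if grafts_file.exists():
        seen = set(grafts_file.read_text().splitlines())

    added = []
    with grafts_file.open("a") as fh:
        for commit, parents in grafts:
            line = graft_line(commit, parents)
            if line not in seen:
                fh.write(line + "\n")
                seen.add(line)
                added.append(line)
    return added


# Placeholder ids (assumptions, not real commits): the first commit of the
# combined branch, and the tips of the three pre-HADOOP-7106 branches.
COMBINED_ROOT = "a" * 40
OLD_TIPS = ["b" * 40, "c" * 40, "d" * 40]  # hdfs, common, mapred
```

Calling install_grafts(".", [(COMBINED_ROOT, OLD_TIPS)]) from the top of a
checkout would make the combined root commit look like a three-way merge of
the old branch tips in that local repo only.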

Pros:
 - all existing sha1s stay the same.
 - Any local branches people might have for works in progress should
continue to refer to proper SHA1s and should rebase relatively easily onto
the combined trunk
 - Should be reasonably simple to implement

Cons:
 - users have to run a script upon checkout in order to graft back together
history

Option 2)
- Use git-filter-branch on the split repos to rewrite them as if their
contents had always lived in their new subdirectories.
- Fetch these repos into the merged repo
- Set up grafts in the merged repo
- Run git-filter-branch --all in the merged repo, which will make the grafts
permanent
- May have to run git-filter-branch to rewrite some of the git-svn-id:
lines in the commit messages to trick git-svn.
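
For the last step, a message rewrite might look roughly like the sketch
below. git-svn records a trailer of the form
"git-svn-id: <url>@<revision> <repository-uuid>" in each commit message,
and the idea would be to swap the old per-project URL prefix for the new
combined one. The URLs and the all-zero UUID here are made up for
illustration, and whether this alone is enough to keep git-svn happy is an
assumption, not something I have verified.

```python
import re

# Matches a git-svn trailer line: "git-svn-id: <url>@<rev> <uuid>".
_SVN_ID = re.compile(r"^(git-svn-id: )(\S+)(@\d+ \S+)$", re.MULTILINE)


def rewrite_svn_id(message, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in any git-svn-id URL.

    Lines whose URL does not start with old_prefix are left untouched,
    as is the rest of the commit message.
    """
    def fix(match):
        url = match.group(2)
        if url.startswith(old_prefix):
            url = new_prefix + url[len(old_prefix):]
        return match.group(1) + url + match.group(3)

    return _SVN_ID.sub(fix, message)


def filter_stream(inp, out, old_prefix, new_prefix):
    """Read a whole commit message from inp and write the rewritten one to out."""
    out.write(rewrite_svn_id(inp.read(), old_prefix, new_prefix))


# Hypothetical prefixes, for illustration only.
OLD_PREFIX = "https://svn.example.org/repos/hdfs/trunk"
NEW_PREFIX = "https://svn.example.org/repos/hadoop/trunk/hdfs"
```

Wired up to stdin and stdout, a function like filter_stream could serve as
the command given to git-filter-branch's --msg-filter option, which feeds
each original commit message on standard input and takes the new message
from standard output.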

This option basically rewrites history so that it looks like the original
project split did what we're planning to do now.

Pros:
 - we have a single cohesive git repo with no need to have users set up
grafts

Cons:
 - all of our SHA1s between the original split and now would change (making
it harder to rebase local branches for example)
 - way more opportunity for error, I think.

I'm leaning towards option 1 above, and happy to write the script which
installs the grafts into the user's local repo.

-Todd


>
>> On Apr 9, 2011, at 11:09 PM, Nigel Daley wrote:
>>
>> All,
>>
>> As discussed in Jan/Feb, I'd like to coordinate a date for committing the
>> re-organization of our svn layout:
>> https://issues.apache.org/jira/browse/HADOOP-7106.  I propose Thursday
>> April 21 at 11am PDT.
>>
>> - I will send out reminders leading up to that date.
>> - I will announce on IRC when I'm about to start the changes.
>> - I will run the script to make the changes.
>> - Ian, can you update the asf-authorization-template file and the
>> asf-mailer.conf files at the same time?
>> - Owen/Todd/Jukka, can you make sure that actions needed by git users are
>> taken care of at the same time? (what are these?)
>>
>> More info on this change is at http://wiki.apache.org/hadoop/ProjectSplit
>>
>> Cheers,
>> Nige
>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera
