hadoop-general mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: HADOOP-7106 (project unsplit) this weekend
Date Mon, 13 Jun 2011 14:05:01 GMT
Could someone unlock some of these branches for anonymous read-only checkout?
At least with MR-279, I get a 403 Forbidden error when I try to check out.


On 6/12/11 6:38 PM, "Todd Lipcon" <todd@cloudera.com> wrote:

OK, this seems to have succeeded without any big problems!

I've re-enabled the git mirrors and the hudson builds. Feel free to commit
to the new trees.

Here are some instructions for the migration:

=== SVN users ===

Next time you "svn up" in your "common" working directory you'll end up
seeing the combined tree, i.e. a mapreduce/, hdfs/, and common/ subdirectory.
This is probably the easiest place from which to work, now. The URLs for the
combined SVN trees are:

trunk: https://svn.apache.org/repos/asf/hadoop/common/trunk/
  (this one has the yahoo-merge branches from common, hdfs, and mapred)
MR-279: http://svn.apache.org/repos/asf/hadoop/common/branches/MR-279
  (this one has the yahoo-merge common and hdfs, and the MR-279 mapred)

The same kind of thing happened for HDFS-1073 and branch-0.21-old.
Pre-project-split branches like branch-0.20 should have remained untouched.

You can proceed to delete your checkouts of the individual mapred and hdfs
trees, since they exist within the combined trees above. If for some reason
you prefer to 'svn switch' an old MR or HDFS-specific checkout to point to
its new location, you can use the following incantation:
svn sw $(svn info | grep URL | awk '{print $2}' | sed

=== Git users ===
The git mirrors of the above 7 branches should now have a set of 4 commits
near the top that look like this:

Merge: 928d485 cd66945 77f628f
Author: Todd Lipcon <todd@apache.org>
Date:   Sun Jun 12 22:53:28 2011 +0000

    HADOOP-7106. Reorganize SVN layout to combine HDFS, Common, and MR in a single tree (project unsplit)


commit 77f628ff5925c25ba2ee4ce14590789eb2e7b85b
Author: Todd Lipcon <todd@apache.org>
Date:   Sun Jun 12 22:53:27 2011 +0000

    Relocate mapreduce into mapreduce/

commit cd66945f62635f589ff93468e94c0039684a8b6d
Author: Todd Lipcon <todd@apache.org>
Date:   Sun Jun 12 22:53:26 2011 +0000

    Relocate hdfs into hdfs/

commit 928d485e2743115fe37f9d123ce9a635c5afb91a
Author: Todd Lipcon <todd@apache.org>
Date:   Sun Jun 12 22:53:25 2011 +0000

    Relocate common into common/

The first of these 4 is a 3-parent "octopus" merge commit of the
pre-project-unsplit branches. In theory, git is smart enough to track
changes through this merge, so long as you pass the right flags (e.g.
--follow). For example:

todd@todd-w510:~/git/hadoop-common$ git log --pretty=oneline --abbrev-commit --follow mapreduce/src/java/org/apache/hadoop/mapred/JobTracker.java | head
77f628f Relocate mapreduce into mapreduce/
90df0cb MAPREDUCE-2455. Remove deprecated JobTracker.State in favour of
ca2aba0 MAPREDUCE-2490. Add logging to graylist and blacklist activity to aid diagnosis of related issues. Contributed by Jonathan Eagles
32aaa2a MAPREDUCE-2515. MapReduce code references some deprecated options. Contributed by Ari Rabkin.

If you want to be able to have git follow renames all the way through the
project split back to the beginning of time, put the following in

In terms of rebasing git branches, git is actually pretty smart. For
example, I have a local "HDFS-1073" branch in my hdfs repo. To transition it
to the new combined repo, I did the following:

# Add my project-split hdfs git repo as a remote:
git remote add splithdfs /home/todd/git/hadoop-hdfs/
git fetch splithdfs

# Checkout a branch in my combined repo
git checkout -b HDFS-1073 splithdfs/HDFS-1073

# Rebase it on the combined 1073 branch
git rebase origin/HDFS-1073

...and it actually applies my patches inside the appropriate subdirectory (I
was surprised and impressed by this!)
If the branch you're rebasing has added or moved files, git might not be
smart enough, and you'll have to manually rename them in your branch inside
the appropriate subtree. For simple patches, though, this seems to work. For
less simple things, the best bet may be to use "git filter-branch" on the
patch series to relocate it inside a subdirectory, and then try to rebase.
Let me know if you need a hand with any git cleanup, happy to help.
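
A minimal sketch of that "git filter-branch" approach, run against a
throwaway repository rather than a real Hadoop checkout. The file names,
commit messages, and the choice of --tree-filter are assumptions made for
illustration; --index-filter is usually faster on large histories.

```shell
#!/bin/sh
set -e

# Scratch repository standing in for an old project-split hdfs checkout.
# File names and commit messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"

mkdir -p src
echo "base" > src/Foo.java
git add src/Foo.java
git commit -q -m "Base commit"

echo "patched" >> src/Foo.java
echo "new file" > src/Bar.java
git add src
git commit -q -m "Example patch that also adds a file"

# Rewrite every commit reachable from HEAD so that its tree lives under hdfs/.
# --tree-filter runs the command in a checked-out copy of each commit.
# The "*" glob skips dotfiles, which is acceptable for this sketch.
# FILTER_BRANCH_SQUELCH_WARNING only silences the notice newer gits print.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --tree-filter '
    mkdir -p hdfs
    for f in *; do
        [ "$f" = hdfs ] || mv "$f" hdfs/
    done
' HEAD

# Every path in the rewritten tip commit is now under hdfs/ ...
rewritten_paths=$(git ls-tree -r --name-only HEAD)
# ... and the last commit's own changes are expressed relative to hdfs/.
last_commit_paths=$(git diff --name-only HEAD~1 HEAD)

echo "$rewritten_paths"
```

Because the whole history is rewritten, the result shares no commits with
the combined repo, so replaying only your own commits would probably need
"git rebase --onto" with the rewritten base commit as the upstream.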

=== Outstanding issues ===

The one outstanding issue I'm aware of is that the test-patch builds should
be smart enough to deal with patches that are relative to the combined root
instead of the original project. Right now, if you export a diff from git,
it will include "hdfs/" or "mapreduce/" in the changed file names, and the
QA bot won't know how to apply it. The workaround is to change directory
into the relevant subproject dir, and then pass "--relative" to "git diff"
or "git show", for example:

todd@todd-w510:~/git/hadoop-common/mapreduce$ git diff --relative
diff --git CHANGES.txt CHANGES.txt
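
To see the effect of "--relative" in isolation, here is a small sketch
using a throwaway repository. The repository layout and file contents are
made up for illustration; only "git diff --relative" itself comes from the
workaround above.

```shell
#!/bin/sh
set -e

# Scratch repository standing in for the combined tree (paths are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"

mkdir mapreduce
echo "first line" > mapreduce/CHANGES.txt
git add mapreduce/CHANGES.txt
git commit -q -m "Add mapreduce/CHANGES.txt"

# Make an uncommitted change inside the subproject.
echo "second line" >> mapreduce/CHANGES.txt

# From the combined root, changed paths carry the "mapreduce/" prefix.
root_paths=$(git diff --name-only)

# From inside the subproject, --relative strips that prefix and also
# excludes changes outside the current directory.
cd mapreduce
relative_paths=$(git diff --relative --name-only)

echo "from root:     $root_paths"
echo "with relative: $relative_paths"
```

The same flag works when writing the patch itself, for instance by
redirecting "git diff --relative" to a file from inside the subproject.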

I imagine there are probably some other things that fell through the cracks.
Please get in touch if there's anything that seems amiss.


On Sun, Jun 12, 2011 at 2:50 PM, Todd Lipcon <todd@cloudera.com> wrote:

> All of the nits I ran into should be resolved and we should be good to go.
> I will start this in just about 10 minutes (3pm PST).
> ***Please hold all commits until further notice!*** I anticipate that this
> should take under an hour, but if there are any bumps along the way it might
> stretch into the evening. I'll send out an "all clear" email when things are
> ready to go on the new layout.
> I've disabled all of the Hudson builds for now and will be re-enabling them
> one by one after reconfiguring their SVN URLs.
> -Todd
> On Sat, Jun 11, 2011 at 8:25 PM, Todd Lipcon <todd@cloudera.com> wrote:
>> Hi all,
>> I'm figuring out one more small nit I noticed in my testing this evening.
>> Hopefully I will figure out what's going wrong and be ready to press the big
>> button tomorrow.
>> Assuming I don't have to "abort mission", my hope is to do this at around
>> 3PM PST tomorrow (Sunday). I'll send out a message asking folks to please
>> hold commits to all branches while the move is in progress.
>> Thanks
>> -Todd
>> On Fri, Jun 10, 2011 at 11:20 AM, Todd Lipcon <todd@cloudera.com> wrote:
>>> Hi all,
>>> Pending any unforeseen issues, I am planning on committing HADOOP-7106
>>> this weekend. I have the credentials from Jukka to take care of the git
>>> trees as well, and have done a "practice" move several times on a local
>>> mirror of the svn.
>>> I'll send out an announcement of the exact time in advance of when I
>>> actually do the commit.
>>> Thanks
>>> -Todd
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
> --
> Todd Lipcon
> Software Engineer, Cloudera

Todd Lipcon
Software Engineer, Cloudera
