From: Apache Wiki
To: Apache Wiki
Reply-To: common-dev@hadoop.apache.org
Date: Mon, 07 Dec 2009 16:13:26 -0000
Message-ID: <20091207161326.3604.47637@eos.apache.org>
Subject: [Hadoop Wiki] Update of "GitAndHadoop" by SteveLoughran

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "GitAndHadoop" page has been changed by SteveLoughran.
The comment on this change is: more on branching.

http://wiki.apache.org/hadoop/GitAndHadoop?action=diff&rev1=4&rev2=5

--------------------------------------------------

  == Before you begin ==

   1. You need a copy of Git on your system. Some IDEs ship with Git support; this page assumes you are using the command line.
-  1. You need a copy of ant 1.7+ on your system for the builds themselves.
+  1. You need a copy of Ant 1.7+ on your system for the builds themselves.
-  1. You need to be online for your first checkout and build.
+  1. You need to be online for your first checkout and build, and for any subsequent build that needs to download new artifacts from the central JAR repositories.
   1. You need to set up Ant so that it works with any proxy you have. This is documented by [[http://ant.apache.org/manual/proxy.html|the Ant team]].

@@ -35, +35 @@

  }}}
  The total download is well over 100MB, so the initial checkout process works best when the network is fast. Once downloaded, Git works offline.

+ == Forking onto GitHub ==
+
+ You can create your own fork of the ASF project and add branches to it as you desire. GitHub prefers that you explicitly fork its copies of Hadoop.
+
+  1. Create a GitHub login at http://github.com/ and add your public SSH keys.
+  1. Go to http://github.com/apache and search for Hadoop and the other Apache projects you want (Avro is handy alongside the others).
+  1. For each project, fork. This gives you your own repository URL, which you can then clone locally with {{{git clone}}}.
+  1. For each patch, branch.
+
  == Building the source ==

  You need to tell all the Hadoop modules to get a local JAR of the bits of Hadoop they depend on. You do this by making sure your Hadoop version does not match anything public, and by using the "internal" repository of locally published artifacts.
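The fork-and-clone steps above can be sketched on the command line. This is a minimal sketch, not part of the wiki page itself: the login {{{yourlogin}}} and the branch name {{{HADOOP-1234}}} are placeholders, and GitHub's exact clone URLs may differ.

{{{
# clone your fork of the common repository (yourlogin is a placeholder)
git clone git@github.com:yourlogin/hadoop-common.git
cd hadoop-common
# track the upstream mirror so you can pull in new changes later
git remote add upstream git://github.com/apache/hadoop-common.git
# one branch per patch/JIRA issue (issue number is a placeholder)
git checkout -b HADOOP-1234
}}}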
@@ -55, +64 @@

  hadoop-mapred.version=${version}
  }}}

+ The {{{resolvers}}} property tells Ivy to look in the local Maven artifact repository for versions of the Hadoop artifacts; if you don't set this, only published JARs from the central repository will get picked up.
+
+ The {{{version}}} property, and the properties derived from it, tell Hadoop which version of the artifacts to create and use. Set this to something different from (ideally ahead of) what is being published, to ensure that your own artifacts are picked up.
+
  Next, symlink this file into every Hadoop module. Then a change to the one file gets picked up by all three.
  {{{
- ln -s ../build.properties hadoop-common/build.properties
- ln -s ../build.properties hadoop-hdfs/build.properties
- ln -s ../build.properties hadoop-mapreduce/build.properties
+ pushd hadoop-common; ln -s ../build.properties build.properties; popd
+ pushd hadoop-hdfs; ln -s ../build.properties build.properties; popd
+ pushd hadoop-mapreduce; ln -s ../build.properties build.properties; popd
  }}}

- You are all set up to build.
+ You are now all set up to build.

  === Build Hadoop ===

@@ -72, +85 @@

  This Ant target not only builds the JAR files, it also copies them to the local {{{${user.home}/.m2}}} directory, where they will be picked up by the "internal" resolver. You can check that this is taking place by running {{{ant ivy-report}}} on a project and seeing where it gets its dependencies.

- If there are problems, don't be afraid to {{{rm -rf ~/.m2/repository/org/apache/hadoop}}} and {{{rm -rf ~/.ivy2/cache/org.apache.hadoop}}} to remove local copies of artifacts.
-
  === Testing ===

  Each project comes with lots of tests; run {{{ant test}}} to run them. If you have made changes to the build and tests fail, it may be that the tests never worked on your machine. Build and test the unmodified source first. Then keep an eye on both the main source and any branch you make.
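Putting the properties above together, a complete build.properties might look like the following. This is only a sketch: the version string is an example (it just needs to be ahead of anything published), and the per-module property names for common and hdfs are inferred by analogy with the {{{hadoop-mapred.version}}} line shown in the diff.

{{{
# use locally published artifacts before the central repository
resolvers=internal
# choose a version that is not in public circulation (example value)
version=0.22.0-dev
hadoop-common.version=${version}
hadoop-hdfs.version=${version}
hadoop-mapred.version=${version}
}}}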
A good way to do this is to give a Continuous Integration server such as Hudson this job: checking out, building, and testing both branches.

+ == Branching ==
+
+ Hadoop makes it easy to branch. The recommended process for working with Apache projects is: one branch per JIRA issue. That makes it easy to isolate development and track each change. It does mean that if you release a branch of your own, one that merges in more than one issue, you have to invest some effort in merging everything together. Try not to make changes in different branches that are hard to merge.
+
+ One thing to look out for is making sure that you build the different Hadoop projects together, and that you have not published on one branch and built on another. This matters because both Ivy and Maven publish artifacts to shared repository cache directories.
+
+  1. Don't be afraid to {{{rm -rf ~/.m2/repository/org/apache/hadoop}}} and {{{rm -rf ~/.ivy2/cache/org.apache.hadoop}}} to remove local copies of artifacts.
+  1. Use different version properties in different branches to ensure that different versions are not accidentally picked up.
+  1. Avoid using {{{latest.version}}} as the version marker in Ivy, as that gives you whatever was built last, from whichever branch.
+  1. Don't build/test different branches simultaneously, such as by running Hudson on your local machine while developing on the console. The trick here is to bring up Hudson in a virtual machine, running against the Git repository on your desktop. Git lets you do this, which lets you run Hudson against your private branch.
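The cache-hygiene advice in the list above can be collapsed into a single switch-branches routine, sketched below; the branch name {{{HADOOP-1234}}} is a placeholder, and the {{{rm -rf}}} paths are the ones the page itself recommends clearing.

{{{
# remove locally published Hadoop artifacts before switching branches,
# so a stale JAR from the old branch is never picked up
rm -rf ~/.m2/repository/org/apache/hadoop
rm -rf ~/.ivy2/cache/org.apache.hadoop
# switch to the branch you want to build next (placeholder name)
git checkout HADOOP-1234
}}}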