hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "GitAndHadoop" by SteveLoughran
Date Mon, 01 Sep 2014 16:25:34 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "GitAndHadoop" page has been changed by SteveLoughran:

rm all details on building

  This odd syntax says "push nothing to github/mybranch".
- == Building with a Git repository ==
- ''The information below this line is relevant for versions of Hadoop before 0.23.x, and
should be considered obsolete for later versions. It is probably out of date for Hadoop 0.22
as well.''
- == Building the source ==
- You need to tell all the Hadoop modules to get a local JAR of the bits of Hadoop they depend
on. You do this by making sure your Hadoop version does not match anything public, and by
using the "internal" repository of locally published artifacts.
- === Create a build.properties file ===
- Create a {{{build.properties}}} file. Do not create it inside the git directories; create
it one level up. This is going to be a shared file. This article assumes you are using Linux
or a different Unix, incidentally.
- Make the file something like this:
- {{{
- #this is essential
- resolvers=internal
- #you can increment this number as you see fit
- version=0.22.0-alpha-1
- project.version=${version}
- hadoop.version=${version}
- hadoop-core.version=${version}
- hadoop-hdfs.version=${version}
- hadoop-mapred.version=${version}
- }}}
- The {{{resolvers}}} property tells Ivy to look in the local maven artifact repository for
versions of the Hadoop artifacts; if you don't set this then only published JARs from the
central repository will get picked up.
- The version property, and the properties derived from it, tell Hadoop which version of
artifacts to create and use. Set this to something different from (ideally ahead of) what is
being published, to ensure that your own artifacts are picked up.
- Next, symlink this file to every Hadoop module. Now a change in the file gets picked up
by all three.
- {{{
- pushd common; ln -s ../build.properties build.properties; popd
- pushd hdfs; ln -s ../build.properties build.properties; popd
- pushd mapreduce; ln -s ../build.properties build.properties; popd
- }}}
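
The links created above can be sanity-checked before building. This is a minimal sketch, assuming it is run from the directory that holds {{{build.properties}}} and the three module checkouts; the function name {{{check_build_links}}} is made up for this example and is not part of the Hadoop build.

```shell
# Hypothetical helper: verify that each module has a build.properties
# symlink that resolves to an existing file.
check_build_links() {
    status=0
    for module in common hdfs mapreduce; do
        link="$module/build.properties"
        # -L: the path is a symlink; -e: its target exists (false if dangling)
        if [ -L "$link" ] && [ -e "$link" ]; then
            echo "ok: $link"
        else
            echo "missing or broken: $link" >&2
            status=1
        fi
    done
    return "$status"
}
```

Calling {{{check_build_links}}} returns a non-zero status if any of the three links is absent or dangling.
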
- You are now all set up to build.
- === Build Hadoop ===
-  1. In {{{common/}}} run {{{ant mvn-install}}}
-  1. In {{{hdfs/}}} run {{{ant mvn-install}}}
-  1. In {{{mapreduce/}}} run {{{ant mvn-install}}}
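
The three steps above can be scripted. This is only a sketch, assuming the modules are built in the order listed and that the script runs from the directory containing the three checkouts; {{{build_all}}} is an illustrative name, not an existing target or tool.

```shell
# Hypothetical wrapper: run "ant mvn-install" in each module, in the
# order given in the list above, and stop at the first failure.
build_all() {
    for module in common hdfs mapreduce; do
        echo "== $module =="
        # The subshell keeps the cd from leaking into the caller.
        ( cd "$module" && ant mvn-install ) || return 1
    done
}
```

Stopping at the first failure avoids building a downstream module against a stale upstream JAR.
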
- This Ant target not only builds the JAR files, it also copies them to the local {{{${user.home}/.m2}}}
directory, where they will be picked up by the "internal" resolver. You can check that this
is taking place by running {{{ant ivy-report}}} on a project and seeing where it gets its
- '''Warning:''' it's easy for old JAR versions to get cached and picked up. You will notice
this early if something in hadoop-hdfs or hadoop-mapreduce doesn't compile, but if you are
unlucky, things compile but do not work, because your updates are not picked up. Run {{{ant clean-cache}}}
to fix this.
- By default, the trunks of the HDFS and mapreduce projects are set to grab the snapshot versions
that get built and published into the Apache snapshot repository nightly. While this saves
developers in these projects the complexity of having to build and publish the upstream artifacts
themselves, it doesn't work if you want to make changes to things like hadoop-common. You
need to make sure the local projects are picking up what's being built locally.
- To check this in the hadoop-hdfs project, generate the Ivy dependency reports using the
internal resolver:
- {{{
- ant ivy-report -Dresolvers=internal
- }}}
- Then browse to the report page listed at the bottom of the process, switch to the "common"
tab, and look for the hadoop-common JAR. It should have a publication timestamp which contains
the date and time of your local build. For example, the string "20110211174419" means
the date 2011-02-11 and the time of 17:44:19. If an older version is listed, you probably
have it cached in the ivy cache. You can fix this by removing everything from the org.apache
corner of this cache.
- {{{
- rm -rf ~/.ivy2/cache/org.apache.hadoop
- }}}
- Rerun the {{{ivy-report}}} target and check that the publication date is current to verify
that the version is now up to date.
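
The publication timestamp in the report can be made easier to read. This is a small sketch that assumes the 14-digit layout from the example above (year, month, day, hour, minute, second); the function name {{{decode_pubdate}}} is made up for illustration.

```shell
# Hypothetical helper: turn a 14-digit publication timestamp into
# "YYYY-MM-DD HH:MM:SS". Prints nothing if the input is not 14 digits.
decode_pubdate() {
    printf '%s\n' "$1" | sed -n \
        's/^\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)$/\1-\2-\3 \4:\5:\6/p'
}
```

For the string quoted above, {{{decode_pubdate 20110211174419}}} prints {{{2011-02-11 17:44:19}}}.
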
- === Testing ===
- Each project comes with lots of tests; run {{{ant test}}} to run them all, {{{ant test-core}}}
for the core tests. If you have made changes to the build and tests fail, it may be that the
tests never worked on your machine. Build and test the unmodified source first. Then keep
an eye on both the main source and any branch you make. A good way to do this is to give a
Continuous Integration server such as Hudson this job: checking out, building and testing
both branches.
- Remember that, because of the way Git works, your machine's own repository is something that
other machines can fetch from. So in theory, you could set up a Hudson server on another machine
(or VM) and have it pull and test against your local code. You will need to run it on a separate
machine to prevent your own builds and tests from interfering with the Hudson runs.
