hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "HowToContribute" by SteveLoughran
Date Fri, 23 Sep 2011 15:13:08 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HowToContribute" page has been changed by SteveLoughran:
http://wiki.apache.org/hadoop/HowToContribute?action=diff&rev1=63&rev2=64

Comment:
Lots more setting-up-your machine detail, including protobuf

  = How to Contribute to Hadoop Common =
  This page describes the mechanics of ''how'' to contribute software to Hadoop Common.  For
ideas about ''what'' you might contribute, please see the ProjectSuggestions page.
  
+ 
+ === Setting up ===
+ 
+ Here are some things you will need to build and test Hadoop. Setting up a working Hadoop development environment takes time, so be prepared to invest some. Before you actually begin trying to write code, get the project to build and test locally; this is how you verify your installation.
+ 
+ ==== Software Configuration Management (SCM) ====
+ 
+ The ASF uses Apache Subversion ("SVN") for its SCM system. There are some excellent GUIs
for this, and IDEs with tight SVN integration, but as all our examples are from the command
line, it is convenient to have the command line tools installed and a basic understanding
of them.
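+ 
+ As a quick sanity check that the command line tools are installed and can reach the repository (the URL below is the read-only Apache repository root, shown purely as an example):
+ {{{
+ svn --version
+ svn info http://svn.apache.org/repos/asf/hadoop/common/trunk/
+ }}}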
+ 
+ A lot of developers now use Git to keep their own (uncommitted-into-apache) code under SCM;
the git command line tools aid with this. See GitAndHadoop for more details.
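+ 
+ For example, to check the git tools and clone a read-only mirror (the authoritative mirror URL is listed on GitAndHadoop; the one shown here is only indicative):
+ {{{
+ git --version
+ git clone git://git.apache.org/hadoop-common.git
+ }}}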
+ 
+ 
+ ==== Integrated Development Environment (IDE) ====
+ 
+ You are free to use whatever IDE you prefer, or your favourite command line editor. Note that:
+  * Building and testing is often done on the command line, or at least via the Maven and Ant support in the IDEs.
+  * Set up the IDE to follow the source layout rules of the project.
+  * If you have commit rights to the repository, disable any added-value "reformat" and "strip trailing spaces" features on commits, as they can create extra noise.
+ 
+ ==== Build Tools ====
+ 
+ To build the code, install the following (as well as the programs needed to run Hadoop on Windows, if that is your development platform):
+  * [[http://ant.apache.org/|Apache Ant]]
+  * [[http://maven.apache.org/|Apache Maven]]
+ These should be on the path; test by executing {{{ant}}} and {{{mvn}}} respectively.
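+ 
+ For example, both commands should print a version string when run from any directory:
+ {{{
+ ant -version
+ mvn -version
+ }}}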
+ 
+ As the Hadoop builds use the external Maven repository to download artifacts, Ant and Maven
need to be set up with the proxy settings needed to make external HTTP requests. You will
also need to be online for the first builds of every Hadoop project, so that the dependencies
can all be downloaded.
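+ 
+ As an illustration, with a hypothetical proxy at proxy.example.com:8080, Maven picks up proxy settings from {{{~/.m2/settings.xml}}}, while Ant honours the standard JVM proxy system properties:
+ {{{
+ <!-- ~/.m2/settings.xml -->
+ <settings>
+   <proxies>
+     <proxy>
+       <active>true</active>
+       <protocol>http</protocol>
+       <host>proxy.example.com</host>
+       <port>8080</port>
+     </proxy>
+   </proxies>
+ </settings>
+ }}}
+ {{{
+ export ANT_OPTS="-Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8080"
+ }}}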
+ 
+ 
+ ==== Other items ====
+ 
+  * A Java Development Kit must be installed and on the path of executables. The Hadoop developers recommend the Sun JDK.
+  * For MapReduce in 0.23+: [[http://code.google.com/p/protobuf/|protocol buffers]] (see the [[http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/README|YARN Readme]]). Verify this by checking that the {{{protoc}}} command is on the path; a combined check for these tools appears after this list.
+  * To build and install ProtocolBuffers from source: a copy of GCC 4.1+.
+  * The source code of projects that you depend on. Avro, Jetty, Log4J are some examples.
This isn't compulsory, but as the source is there, it helps you see what is going on.
+  * The source code of the Java version that you are using. Again: handy.
+  * The Java API javadocs. 
+  * the {{{diff}}} and {{{patch}}} commands, which ship with Unix/Linux systems, and come
with cygwin. 
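+ 
+ A quick way to verify these prerequisites together; each command should print a version string rather than "command not found":
+ {{{
+ java -version
+ javac -version
+ protoc --version
+ diff --version
+ patch --version
+ }}}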
+  
+ 
+ ==== Hardware Setup ====
+ 
+  * Lots of RAM, especially if you are using a modern IDE. ECC RAM is recommended in large-RAM
systems. 
+  * Disk Space. Always handy.
+  * Network Connectivity. Not all Hadoop tests are guaranteed to work if a machine does not have a network connection, especially if it does not know its own name.
+  * Keep your computer's clock up to date via an NTP server, and set up the time zone correctly.
This is good for avoiding change-log confusion.
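+ 
+ On most Unix/Linux systems a one-off clock sync can be done with the standard NTP tools, for example (requires root; the server name is only an example):
+ {{{
+ sudo ntpdate pool.ntp.org
+ }}}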
+ 
  === Getting the source code ===
  First of all, you need the Hadoop source code. The official location for Hadoop is the Apache SVN repository; Git is also supported, and useful if you want to make lots of local changes and keep those changes under some form of private or public revision control.
  
@@ -18, +65 @@

  svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-X.Y.Z/ hadoop-common-X.Y.Z
  }}}
  If you prefer to use Eclipse for development, there are instructions for setting up SVN
access from within Eclipse at EclipseEnvironment.
+ 
+ '''Committers:''' Check out using https:// URLs instead.
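+ 
+ For example, a committer checking out trunk over https (the path follows the same pattern as the read-only URL above):
+ {{{
+ svn checkout https://svn.apache.org/repos/asf/hadoop/common/trunk/ hadoop-common-trunk
+ }}}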
  
  ==== Git Access ====
  See GitAndHadoop
