hadoop-common-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-6671) To use maven for hadoop common builds
Date Thu, 07 Jul 2011 18:52:19 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated HADOOP-6671:

    Status: Patch Available  (was: Open)

The attached patch works on top of revision 1143624.

This is a complete rewrite of the patch doing things the Maven way. It is much
simpler than my previous patch. There is a single user-activated profile, used
to do native compilation. The Maven Ant plugin is used only for things that are
not possible with existing Maven plugins (generating the Forrest documentation,
generating package-info.java, and copying the SO files preserving symlinks).

What is left:

* rewire the patch & Jenkins scripts
* tests that use AOP
* tests that do fault injection
* generating RPM/DEB packages

For the tests I will need some help as I'm completely ignorant about them.

I'll work on the patch & Jenkins scripts (Tom, help!! :).

At this point I would like committers to review this and, if OK, commit it;
integrating the tests that use AOP and fault injection, and the RPM/DEB
packaging, would be handled as follow-up JIRAs.

Patching instructions:

(I've tested it with both a Git checkout and an SVN checkout.)

* Run the 'mvn-layout-k.sh' script. If using an SVN checkout, pass the 'svn' parameter 
  (it will do SVN file operations). If using a Git checkout, pass the 'fs' parameter 
  (it will do plain filesystem operations).
* Apply the 'HADOOP-6671-k.patch' patch.
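As a sketch, the two steps above would look like this from the root of an SVN
checkout (the patch level, -p0, is an assumption; adjust it to how the patch
was generated):

```shell
# Rearrange the source tree into the Maven layout (SVN checkout shown;
# pass 'fs' instead of 'svn' for a Git checkout)
sh mvn-layout-k.sh svn

# Apply the Maven build files on top of the rearranged tree
# (-p0 assumed; adjust to the patch's path prefix)
patch -p0 < HADOOP-6671-k.patch
```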


The trunk has a pom.xml file; at the moment it is only wired to build Hadoop
Common. Later hdfs, mapred, and the contribs will be wired here.

The trunk/root module contains the foundation pom.xml inherited by all Hadoop
modules defining versions of plugins and dependencies as well as common properties.

The trunk/doclet module builds the Hadoop doclet used when generating Javadocs 
and Jdiff reports. hdfs and mapred will also use this doclet.

The trunk/common-main module is Hadoop common proper. It contains 3 sub-modules:
trunk/common-main/common, trunk/common-main/docs, trunk/common-main/distro.

The trunk/common-main/common module generates Hadoop Common JAR.

The trunk/common-main/docs module generates Hadoop Common Forrest documentation.

The trunk/common-main/distro module generates Hadoop Common tarball.

The reason for naming the module trunk/common-main rather than trunk/common is
that having two nested modules called common (trunk/common/common) would be
confusing, and I didn't want to rename the inner module to something else as it
is the one generating the hadoop-common JAR.

For the RPM/DEB packages the idea is to use alternate assembly descriptors in
the trunk/common-main/distro module that will generate the layout required by
the packaging tools.

There are Maven plugins that invoke the RPM/DEB tools. We'll have to decide
whether to use those plugins or to jump out of Maven for the RPM/DEB
generation. I'll work with Giri to decide this, as he did the original RPM/DEB
generation.


Requirements:

* Unix System
* JDK 1.6
* Maven 3.0
* Forrest 0.8 (if generating docs)
* Findbugs 1.3.9 (if running findbugs)
* Autotools (if compiling native code)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)

Maven modules:

  hadoop (Main Hadoop project)
         - root (bootstrap module with the parent pom inherited by all modules)
                (all plugin & dependency versions are defined here            )
         - doclet (generates the Hadoop doclet used to generate the Javadocs)
         - common-main (Hadoop common Main)
                       - common (Java & Native code)
                       - docs   (documentation)
                       - distro (creates TAR)

Where to run Maven from?

  It can be run from any module. The only catch is that if not run from trunk,
  all modules that are not part of the build run must be installed in the local
  Maven cache or available in a Maven repository.
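For example, a hypothetical session that builds everything once from trunk and
then iterates on the common module alone:

```shell
# Install all modules into the local Maven cache (~/.m2) from trunk first
cd trunk
mvn install -DskipTests

# Then work from a single module; its siblings resolve from the local cache
cd common-main/common
mvn test
```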

Maven build goals:

 * Clean                     : mvn clean
 * Compile                   : mvn compile
 * Run tests                 : mvn test
 * Create JAR                : mvn package
 * Run findbugs              : mvn compile findbugs:findbugs
 * Run checkstyle            : mvn checkstyle:checkstyle
 * Install JAR in M2 cache   : mvn install
 * Deploy JAR to Maven repo  : mvn deploy
 * Run clover                : mvn clover:clover
 * Run Rat                   : mvn apache-rat:check
 * Build documentation       : mvn package site
 * Build TAR                 : mvn package post-site (*)

 Build options:

  * Use -Pnative  to compile/bundle native code
  * Use -DskipTests to skip tests when running the following Maven goals:
    'package',  'install'  or 'deploy'
  * Use -Dsnappy.prefix=<path> (default /usr/local) & -Dbundle.snappy=<true|false>
    (default false) to compile the Snappy JNI bindings and to bundle the Snappy SO files
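Putting the goals and options together, a full build with native code and
bundled Snappy might look like this (the Snappy prefix path is illustrative):

```shell
# Build the TAR, compiling native code and bundling the Snappy SO files,
# with tests skipped
mvn package post-site -Pnative -DskipTests \
    -Dsnappy.prefix=/usr/local -Dbundle.snappy=true
```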

 Tests options:

  * -Dtest=<TESTCLASSNAME>,....
  * -Dtest.exclude=<TESTCLASSNAME>
  * -Dtest.exclude.pattern=**/<TESTCLASSNAME1>.java,**/<TESTCLASSNAME2>.java
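For instance (the test class names are illustrative):

```shell
# Run a single test class
mvn test -Dtest=TestConfiguration

# Run everything except a couple of classes matched by pattern
# (quoted so the shell does not expand the globs)
mvn test -Dtest.exclude.pattern='**/TestFoo.java,**/TestBar.java'
```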

(*) We piggyback on the post-site lifecycle phase to wire the Ant & Assembly
    plugins, in that order, to generate a TAR with symlinks. The issue is that
    the Assembly plugin on its own is not bound to a lifecycle phase where we
    could hook the Ant plugin to copy the SO files preserving symlinks (the
    Assembly plugin does not handle symlinks).

> To use maven for hadoop common builds
> -------------------------------------
>                 Key: HADOOP-6671
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6671
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: build
>    Affects Versions: 0.22.0
>            Reporter: Giridharan Kesavan
>            Assignee: Alejandro Abdelnur
>         Attachments: HADOOP-6671-cross-project-HDFS.patch, HADOOP-6671-e.patch, HADOOP-6671-f.patch,
HADOOP-6671-g.patch, HADOOP-6671-h.patch, HADOOP-6671-i.patch, HADOOP-6671-j.patch, HADOOP-6671-k.sh,
HADOOP-6671.patch, HADOOP-6671b.patch, HADOOP-6671c.patch, HADOOP-6671d.patch, build.png,
common-mvn-layout-i.sh, hadoop-commons-maven.patch, mvn-layout-e.sh, mvn-layout-f.sh, mvn-layout-k.sh,
mvn-layout.sh, mvn-layout.sh, mvn-layout2.sh, mvn-layout2.sh
> We are now able to publish hadoop artifacts to the maven repo successfully [HADOOP-6382]
> Drawbacks with the current approach:
> * Use ivy for dependency management with ivy.xml
> * Use maven-ant-task for artifact publishing to the maven repository
> * pom files are not generated dynamically 
> To address this I propose we use maven to build hadoop-common, which would help us to
manage dependencies, publish artifacts and have one single XML file (POM) for dependency
management and artifact publishing.
> I would like to have a branch created to work on mavenizing hadoop common.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

