hadoop-general mailing list archives

From "Rottinghuis, Joep" <jrottingh...@ebay.com>
Subject RE: follow up Hadoop mavenization work
Date Fri, 29 Jul 2011 02:10:22 GMT
Alejandro,

Are you addressing the use case where people want to locally build a consistent set of common,
hdfs, and mapreduce without the downstream projects depending on published Maven SNAPSHOTs?
I'm working to get this going on 0.22 right now (see HDFS-843, HDFS-2214, and I'll have to
file two equivalent bugs on mapreduce).

Part of the problem is the assumption that people always compile hdfs against hadoop-common-0.xyz-SNAPSHOT.
When applying one patch at a time from Jira attachments, that may be fine.

If I set up a Jenkins build I will want to make sure that first hadoop-common builds with
a new build number (not snapshot), then hdfs against that same build number, then mapreduce
against hadoop-common and hdfs.
Otherwise you can get a situation where the mapreduce build is still running and the hadoop-common
build has already produced a new snapshot build.
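
The ordering described above can be pictured with a minimal Python sketch. It only plans the steps and runs no build tool. The `BuildStep` type, the `plan_builds` helper, and the `<base>-<build number>` version format are all hypothetical, chosen for illustration and not taken from the Hadoop build files.

```python
from dataclasses import dataclass, field

# Build order from the message: common first, then hdfs, then mapreduce.
# Each entry lists the upstream modules it must be compiled against.
MODULES = [
    ("hadoop-common", []),
    ("hadoop-hdfs", ["hadoop-common"]),
    ("hadoop-mapreduce", ["hadoop-common", "hadoop-hdfs"]),
]


@dataclass
class BuildStep:
    module: str
    version: str
    # Maps each upstream module name to the exact version to compile against.
    depends_on: dict = field(default_factory=dict)


def pinned_version(base: str, build_number: int) -> str:
    """Return a non-SNAPSHOT version string (the format is an assumption)."""
    return f"{base}-{build_number}"


def plan_builds(base: str, build_number: int) -> list:
    """Plan one consistent build: every module and dependency shares one version."""
    version = pinned_version(base, build_number)
    return [
        BuildStep(name, version, {dep: version for dep in deps})
        for name, deps in MODULES
    ]


if __name__ == "__main__":
    for step in plan_builds("0.22.0", 42):
        print(step.module, step.version, step.depends_on)
```

Because every step carries the same pinned version, a newer hadoop-common build that finishes while mapreduce is still compiling cannot change what mapreduce resolves.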

Local caching in ~/.m2 and ~/.ivy2 repos makes this situation even more complex.

Having the ability to build without Internet connectivity is not just for laptops on the go.
In corporate environments one may not want a build server to have Internet connectivity.
In that case should one do a build on a machine with connectivity first and then fork-lift
the ~/.m2/repository directory over?
Should any hadoop-common, hadoop-hdfs and hadoop-mapreduce artifacts be purged in that case
(since they should be rebuilt locally)?
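
If purging did turn out to be the right answer, it could look roughly like the sketch below. It relies on the standard Maven repository layout, in which the group id is mapped to a directory path followed by the artifact id. The `purge_artifacts` helper is hypothetical, and the `org.apache.hadoop` group id is an assumption about where these artifacts are stored.

```python
import shutil
from pathlib import Path

# Artifact names as mentioned in the message; the group id default is an assumption.
HADOOP_ARTIFACTS = ("hadoop-common", "hadoop-hdfs", "hadoop-mapreduce")


def purge_artifacts(repo_root, group_id="org.apache.hadoop",
                    artifact_ids=HADOOP_ARTIFACTS):
    """Delete <repo_root>/<group path>/<artifact id> for each artifact.

    Returns the list of directories that were actually removed.
    """
    # Maven stores group "a.b.c" under the directory path a/b/c.
    group_dir = Path(repo_root).joinpath(*group_id.split("."))
    removed = []
    for artifact_id in artifact_ids:
        artifact_dir = group_dir / artifact_id
        if artifact_dir.is_dir():
            shutil.rmtree(artifact_dir)
            removed.append(artifact_dir)
    return removed
```

A call such as `purge_artifacts(Path.home() / ".m2" / "repository")` would leave all third-party dependencies in place and remove only the three Hadoop artifact directories, so they have to be rebuilt locally. The ~/.ivy2 cache uses a different layout and is not covered here.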

Thanks,

Joep

-----Original Message-----
From: Alejandro Abdelnur [mailto:tucu@cloudera.com] 
Sent: Thursday, July 28, 2011 4:41 PM
To: general@hadoop.apache.org
Subject: follow up Hadoop mavenization work

Following up on the Hadoop Common mavenization (HADOOP-6671), I've just posted a patch for HDFS
mavenization (HDFS-2096).

The HADOOP-6671 patch integrates all feedback received in the JIRA and, IMO, it is ready for
prime time.

In order not to break HDFS and MAPRED, which are still Ant based, there are 2 patches, HDFS-2196
& MAPREDUCE-2741, that make some corrections in the ivy configuration to work correctly
with the Hadoop common JAR (built/published by the Mavenized build).

HDFS-2096 is not 100% ready: some testcases are failing and native code testing is not wired,
but everything else (compile, test, package, tar, binary, jdiff, etc.) is wired.

* https://issues.apache.org/jira/browse/HADOOP-6671
* https://issues.apache.org/jira/browse/HDFS-2196
* https://issues.apache.org/jira/browse/MAPREDUCE-2741
* https://issues.apache.org/jira/browse/HDFS-2096

I know these are big changes and we'll have some hiccups, but the benefits are big (running
testcases is faster, it easily works from IDEs, and the Maven build system can easily be understood
by anybody who knows Maven).

Keeping the patches current is time-consuming. Because of this, it would be great if we could
get in the ones that are ready (HADOOP-6671, HDFS-2196, MAPREDUCE-2741) so we can focus on the
rest of the Mavenization work.

Thanks.

Alejandro
