hadoop-common-dev mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Developing cross-component patches post-split
Date Fri, 03 Jul 2009 15:37:08 GMT
Todd Lipcon wrote:
> On Wed, Jul 1, 2009 at 2:10 PM, Philip Zeyliger <philip@cloudera.com> wrote:
>> -1 to checking in jars.  It's quite a bit of bloat in the repository (which
>> admittedly affects the git.apache folks more than the svn folks), but it's
>> also cumbersome to develop.
>> It'd be nice to have a one-liner that builds the equivalent of the tarball
>> built by "ant binary" in the old world.  When you're working on something
>> that affects both common and hdfs, it'll be pretty painful to make the jars
>> in common, move them over to hdfs, and then compile hdfs.
>> Could the build.xml in hdfs call into common's build.xml and build common
>> as part of building hdfs?  Or perhaps have a separate "top-level" build
>> file that builds everything?
> Agree with Philip here. Requiring a new jar to be checked in anywhere after
> every common commit seems unscalable and nonperformant. For git users this
> will make the repository size balloon like crazy (the jar is 400KB and we
> have around 5300 commits so far = 2GB!). For svn users it will still mean
> that every "svn update" requires a download of a new jar. Using svn
> externals to manage them also complicates things when trying to work on a
> cross-component patch with two dirty directories - you really need a symlink
> between your working directories rather than through the SVN tree.
> I think it would be reasonable to require that developers check out a
> structure like:
> working-dir/
>   hadoop-common/
>   hadoop-mapred/
>   hadoop-hdfs/
> We can then use relative paths for the mapred->common and hdfs->common
> dependencies. Those who only work on HDFS or only work on mapred will not
> have to check out the other, but everyone will check out common.
> Whether there exists a fourth repository (eg hadoop-build) that has a
> build.xml that ties together the other build.xmls is another open question
> IMO.

1. You can have a build file on top that uses <ivy:buildlist> to create 
a correctly ordered list of child projects.

For this to work, every child project needs a common set of build file 
targets (clean, release, tested).
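
A minimal sketch of such a top-level build file, assuming the projects are checked out as sibling directories next to it and that each has an ivy.xml beside its build.xml (<ivy:buildlist> reads that file by default to work out the order). The project name, the hadoop-* directory pattern and the "ordered.projects" reference id are illustrative assumptions, not anything taken from the Hadoop tree:

```xml
<project name="hadoop-all" default="tested"
         xmlns:ivy="antlib:org.apache.ivy.ant">

  <!-- Assumes the Ivy jar is already on Ant's classpath (e.g. in ANT_HOME/lib). -->

  <!-- Scan the child build files and store them, dependency-ordered,
       in a path reference that <subant> can walk. -->
  <target name="buildlist">
    <ivy:buildlist reference="ordered.projects">
      <fileset dir="${basedir}" includes="hadoop-*/build.xml"/>
    </ivy:buildlist>
  </target>

  <!-- Each target simply delegates to the same-named target in every child,
       which is why the children need a common set of target names. -->
  <target name="clean" depends="buildlist">
    <subant target="clean" buildpathref="ordered.projects"/>
  </target>

  <target name="tested" depends="buildlist">
    <subant target="tested" buildpathref="ordered.projects"/>
  </target>

  <target name="release" depends="buildlist">
    <subant target="release" buildpathref="ordered.projects"/>
  </target>
</project>
```

The ordering only comes out right if the ivy.xml of hdfs and mapred actually declares a dependency on common; without that, the list has no guaranteed order.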


2. It's handy to have a target to delete all org.apache.hadoop artifacts 
from wherever ivy is caching them. This lets you be confident that 
nothing out of date is being picked up.
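
Something like the following would do it, as a rough sketch. It assumes Ivy's default layout, where the cache lives under ${user.home}/.ivy2/cache and locally published artifacts under ${user.home}/.ivy2/local, each with one directory per organisation; if your ivysettings.xml moves the cache or changes its pattern, the paths need adjusting. The target and property names are made up for illustration:

```xml
<project name="ivy-cache-purge" default="purge-hadoop-artifacts">

  <!-- Assumption: default Ivy home; override with -Divy.home.dir=... if not. -->
  <property name="ivy.home.dir" location="${user.home}/.ivy2"/>

  <target name="purge-hadoop-artifacts"
          description="Delete cached and locally published org.apache.hadoop artifacts">
    <!-- Artifacts resolved from any repository are cached here. -->
    <delete dir="${ivy.home.dir}/cache/org.apache.hadoop" quiet="true"/>
    <!-- Artifacts published to the default local repository end up here. -->
    <delete dir="${ivy.home.dir}/local/org.apache.hadoop" quiet="true"/>
  </target>
</project>
```

The blunter alternative is Ivy's own <ivy:cleancache/> task, which wipes the whole cache rather than one organisation, at the cost of re-downloading every third-party jar.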
