hadoop-common-dev mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: Developing cross-component patches post-split
Date Thu, 02 Jul 2009 17:02:09 GMT

On 7/1/09 11:58 PM, "Nigel Daley" <ndaley@yahoo-inc.com> wrote:

> On Jul 1, 2009, at 10:16 PM, Todd Lipcon wrote:
>> On Wed, Jul 1, 2009 at 10:10 PM, Raghu Angadi <rangadi@yahoo-inc.com> wrote:
>>> -1 for committing the jar.
>>> Most of the various options proposed sound certainly better.
>>> Can build.xml be updated such that Ivy fetches recent (nightly)
>>> build?
> +1.  Using ant command line parameters for Ivy, the hdfs and mapreduce
> builds can depend on the latest Common build from one of:
> a) a local filesystem ivy repo/directory (ie. a developer build of
> Common that is published automatically to local fs ivy directory)
> b) a maven repo (ie. a stable published signed release of Common)
> c) a URL

The standard approach to this problem is the above -- a local file system
repository, with local developer build output, and a shared repository with
build-system blessed content.
A developer can choose which to use based on their needs.
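
As a rough sketch of that choice (Python for brevity, not real Ivy
configuration; the repository URL, jar naming, and function name are all
made up for illustration), a build could look in the developer's local
repository first and fall back to the shared one:

```python
from pathlib import Path

# Hypothetical shared repository; not the actual Hadoop or Apache location.
SHARED_REPO_URL = "https://repo.example.org/hadoop"


def locate_common_jar(version: str, local_repo: Path, prefer_local: bool = True) -> str:
    """Return the location a build should fetch the Common jar from.

    The jar naming scheme here is an assumption made for this sketch.
    """
    jar_name = f"hadoop-common-{version}.jar"
    local_jar = local_repo / jar_name
    # Developer builds published to the local directory win when present.
    if prefer_local and local_jar.is_file():
        return local_jar.resolve().as_uri()
    # Otherwise use the build-system blessed copy in the shared repository.
    return f"{SHARED_REPO_URL}/{jar_name}"
```

Passing prefer_local=False is how a developer would opt into the blessed
content even when a local build exists.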

For ease of use, there is always a way to trigger the dependency chain for a
"full" build.  Typically with Java this is a master ant script or a maven
POM.  The build system must either know to build all at once with the proper
dependency order, or versions are decoupled and dependency changes happen
only when manually triggered (e.g. Hdfs at revision 9999 uses common 9000,
and then a check-in pushes hdfs 10000 to use a new common version).
Checking in jars is generally frowned upon.  Rather, metadata is checked
in -- the revision number and branch that can create the jar, and the jar
can be fetched from a repository or built with that metadata.
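
To make the two modes concrete, here is a minimal Python sketch, not an
actual build script.  The dependency map only encodes what is stated in
this thread (hdfs and mapreduce depend on common), the function names are
invented, and revision 9500 in the usage note is a made-up number:

```python
from graphlib import TopologicalSorter

# Component -> components it depends on, as described in this thread.
DEPENDS_ON = {
    "common": set(),
    "hdfs": {"common"},
    "mapreduce": {"common"},
}


def full_build_order(depends_on):
    """Coupled mode: build everything, dependencies first."""
    return list(TopologicalSorter(depends_on).static_order())


def bump_pin(pins, component, dependency, new_revision):
    """Decoupled mode: return a copy of the checked-in metadata with one
    dependency moved to a new revision.  Nothing changes until this is
    done explicitly in a check-in."""
    updated = {name: dict(deps) for name, deps in pins.items()}
    updated.setdefault(component, {})[dependency] = new_revision
    return updated
```

With pins = {"hdfs": {"common": 9000}}, calling
bump_pin(pins, "hdfs", "common", 9500) gives the metadata that the next
hdfs check-in would carry, while full_build_order(DEPENDS_ON) always puts
common ahead of its dependents.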

AFAICS those are the only two options -- tight coupling, or strict
separation.  The latter means that changes to common aren't picked up by
hdfs or mapreduce until the dependent version is incremented in the metadata
(harder and more restrictive to devs), and the former means that all are
essentially the same coupled version (more complicated on the build system
side but easy for devs).
Developers can span both worlds, but the build system has to pick only one.

> Option c can be a stable URL to the last successful Hudson build and
> is in fact what all the Hudson hdfs and mapreduce builds could be
> configured to use.  An example URL would be something like:
> http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/lastSuccessfulBuild/artifact/
> ...
> Giri is creating a patch for this and will respond with more insight
> on how this might work.
>> This seems slightly better than actually committing the jars.
>> However, what should we do when the nightly build has failed hudson
>> tests? We seem to sometimes go weeks at a time without a "green"
>> build out of Hudson.
> Hudson creates a "lastSuccessfulBuild" link that should be used in
> most cases (see my example above).  If Common builds are failing we
> need to respond immediately.  Same for other sub-projects.  We've got
> to drop this culture that allows failing/flaky unit tests to persist.
>>> HDFS could have a build target that builds common jar from a
>>> specified source location for common.
>> This is still my preferred option. Whether it does this with a
>> <javac> task or with some kind of <subant> or even <exec>, I think
>> having the source trees "loosely" tied together for developers is a
>> must.
> -1.  If folks really want this, then let's revert the project split. :-o
> Nige
