Mailing-List: contact harmony-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: harmony-dev@incubator.apache.org
Received-SPF: neutral (asf.osuosl.org: 195.212.29.134 is neither permitted nor
 denied by domain of mark.hindess@googlemail.com)
Message-Id: <200605112001.k4BK180v005346@d06av02.portsmouth.uk.ibm.com>
From: Mark Hindess <mark.hindess@googlemail.com>
To: harmony-dev@incubator.apache.org, geir@pobox.com
Subject: Re: Supporting working on a single module? 
In-reply-to: Your message of "Wed, 10 May 2006 08:07:28 EDT."
             <4461D780.2030708@pobox.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 11 May 2006 21:01:08 +0100


On 10 May 2006 at 8:07, Geir Magnusson Jr <geir@pobox.com> wrote:
> Mark Hindess wrote:
> > On 9 May 2006 at 10:32, Geir Magnusson Jr <geir@pobox.com> wrote:
> >> Mark Hindess wrote:
> >>> As the Harmony Classlib grows, I think that being able to work on a
> >>> single module (or some subset of the modules) will become the
> >>> typical way of working for many (perhaps even most) contributors.
> >> Sure, that makes sense.
> >>
> >>> So I think we need a plan to support this.  I also think that
> >>> forcing ourselves to support this mode of working sooner rather than
> >>> later will help us keep from accidentally breaking the modularity in
> >>> the current build/development process.
> >> Can you not work on a single module now?
> > 
> > Yes.  But only by checking out everything.
> 
> Wouldn't you do that anyway?
> 
> I guess that's the mystery to me.  I figured that someone would do the 
> full checkout, and go from there.

Well, IIRC we have at least one potential developer on the list who
hasn't been able to do that.  (Aside: what happened about making a
source snapshot?)

> [ SNIP ]
>
> > I don't think it's a good idea for one module referencing into another
> > directly where it isn't a well-defined interface that we can
> > manage/enforce appropriately.  Would you agree?
> 
> But I don't see a solution here, yet.  Just balling things up into a 
> "HDK" doesn't seem to fix it.

Well, we copy things into deploy/jre and thereby enable modular
development of java code.  Why not do the same for C header files?

Admittedly we don't actually take modular development of the java
code as far as I'm suggesting but I think we should.

> > We might as well try to agree how to construct and manage this
> > "interface".  I think the copying/"hdk" idea is a good solution.
> 
> I guess I don't grok how the copying solves the root problem of 
> coupling, because the cross-coupling due to C/C++ is still there.

(Not sure how it's "due to C/C++" since it's arguably worse for the java
case; we just don't notice because we already copy the jars and add
them to the (boot)classpath to enable compilation in the modules.)

> I certainly can see that the copying is useful - for a given platform 
> target, you can copy the stuff to the top, I guess.
> 
> But I can also see having a "top" level per module, for which the target 
> platform is copied into from the bowels of the module...

I was thinking of the copying as being a statement that these header
files (out of all those that might be used within a module) form the
"public" interface for this module (public w.r.t. other modules at
least).

You are correct that we could copy them to the top of each module, but
I think copying them to one top location is more consistent with the
way we copy the boot jars to one location.
 
> That seems cleaner and keeps separate "namespaces".

When we combine the boot jars into one location, I don't think there is
any difficulty in understanding which jar belongs to which module.  I
don't see how this is much different so long as we are thoughtful about
naming the header files.  I don't think there are enough to worry about
it yet, but we could split the include directory with per-module
subdirectories if things start to get out of hand.

Putting them at the top-level makes it easier to do the "balling up" to
create an hdk to compile against.  But I also think it is more consistent.
When doing the javac in a module, we don't reference dependent modules
directly, we just point at the complete collection of boot jars.


I also think this way makes it easier to have identical
build mechanisms for both a check-out-everything-build and a
check-out-one-module-and-untar-hdk build since at the module level we
don't need to worry about whether the deploy/hdk tree was populated by
simple copying or by untar'ing an hdk.
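
To illustrate, here is a minimal sketch (not the actual build; the
function names, the tarball name and the placeholder header are all
assumptions made up for the example) showing that a deploy/hdk tree
populated by copying and one populated by untar'ing a snapshot can end
up with the same contents, so a module build need not care which
happened:

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def populate_by_copy(built_tree, hdk_dir):
    """Full-checkout case: copy freshly built artifacts into deploy/hdk."""
    shutil.copytree(built_tree, hdk_dir)


def populate_by_untar(snapshot, hdk_dir):
    """Single-module case: unpack a downloaded hdk snapshot instead."""
    Path(hdk_dir).mkdir(parents=True)
    with tarfile.open(snapshot) as tar:
        tar.extractall(hdk_dir)


def listing(root):
    """Relative paths of all files under root, for comparing two trees."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix()
                  for p in root.rglob("*") if p.is_file())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        # A stand-in for the output of a "make the world" build.
        built = tmp / "built"
        (built / "include").mkdir(parents=True)
        (built / "include" / "hypothetical.h").write_text("/* placeholder */\n")
        # A stand-in for a published hdk snapshot of that same output.
        snapshot = tmp / "hdk.tar.gz"
        with tarfile.open(snapshot, "w:gz") as tar:
            tar.add(built, arcname=".")
        copied = tmp / "a" / "deploy" / "hdk"
        untarred = tmp / "b" / "deploy" / "hdk"
        populate_by_copy(built, copied)
        populate_by_untar(snapshot, untarred)
        return listing(copied), listing(untarred)
```

The module-level build would only ever read from deploy/hdk, which is
the point: it has no way (and no need) to tell the two cases apart.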

> > I didn't imagine these issues would be this contentious or perhaps I'd
> > have tried to separate these two (related) issues.
> 
> That may be the problem here - I'm confusing the two issues?

I still seem to be having trouble separating them, but I think that is
because I'm in favour of supporting the one-module-hdk style of working,
so that makes me lean towards a solution that supports it.  Whereas you
aren't constrained by this (which I think is a good thing since it makes
for a more thorough discussion of the issues).

> > 
> >>> This means we can then create a snapshot tar/zip of the common
> >>> include directory which someone wishing to work on a single module
> >>> could download to build against.  This is not dissimilar to the
> >>> current way in which you could build the java sources against a
> >>> deploy/jre snapshot.
> > 
> >> Why wouldn't someone wishing to work on a single module just checkout
> >> all the code?
> > 
> > So someone working on prefs (which is approximately 2MB) would need to
> > check out all the source for luni (the current largest module at ~36MB),
> > awt, swing, sound, etc. ?
> 
> Yes.  I actually think they would if it builds.  It appears to me that 
> the classlibrary is intercoupled - our modules are an unnatural (to the 
> status quo) segmentation we've placed on top in our quest for something 
> better.
> 
> I guess I need to re-read and figure out how this would solve that 
> problem.  It seems to just add to the number of moving parts.

I still think it is quite consistent with how we have solved the problem
with respect to java code.  It just happens that with the java code the
jre is a natural point for combining the artifacts and there isn't one
for the "hdk".

> > I usually have half a dozen workspaces where I'm trying things out (even
> > more at the moment since I'm looking at the four contributions that are
> > being voted on).  This isn't too bad at the moment with each one being
> > about 1/4 GB but it will get bigger over time and therefore less
> > manageable.
> 
> Aha...
> > 
> >> I'm really wary about having non-SVN-sourced artifacts meant to be
> >> used in building/development.
> > 
> > Isn't that modularity?  Why shouldn't we do it - our customers will be?
> 
> Because our users will be using released things.  Things that are 
> stable.  A "*DK" by definition is stable, stable code, stable 
> interfaces, etc.  (ok, slow moving, really, but you know what I mean...)

I was more thinking of the customers that might take Harmony and replace
modules.  For instance, the makers of an SSL Accelerator Card might
replace the security module with a specialised one.

> > 
> >> Smells a bit like the first step towards the classic Windows
> >> "dll-hell".
> > 
> > Undoubtedly, with people working on separate modules, we will get build
> > breaks.  But we'll get them when, for example, we don't have sufficient
> > unit test coverage within the module being worked on - we had an example
> > of this not long ago IIRC.  We'll also have breaks when people have made
> > different assumptions about the meaning of the spec or the definition of
> > the internal API.  It is a good thing that we find these *bugs* within
> > the project... being our own customers!
> 
> I don't understand how a "hdk" helps find the bugs.

I was thinking of incidents like this where we discovered (I think) that
the coverage of the text tests was insufficient:

  http://mail-archives.apache.org/mod_mbox/incubator-harmony-dev/200604.mbox/%3c443B8A71.8080804@googlemail.com%3e

If people work on individual modules this kind of thing might happen
more often but I think it is useful to discover bugs like this early.
(Having said that we can/should include the tests from all modules in
the hdk - I don't think they belong in the jdk or jre - so that single
module developers can still have the option of running more extensive
tests.)

> > I think this will actually help *avoid* some of the problems you are
> > thinking about.
> > 
> >>> For windows, the snapshot would also include the .lib files that are
> >>> required to build for the run-time .dll files.
> >>>
> >>> What is this new snapshot?  Well, Tim suggested 
> >> (Where?)
> >>
> >>> that it was a little like the jdk but for Harmony development.  So
> >>> perhaps it should be called an hdk, Harmony Development Kit?
> >> I'm missing the point... why wouldn't a developer checkout head, build
> >> that, and then get to work on whatever part of whatever module they
> >> wanted?
> > 
> > The classlib is getting pretty big.  If we are serious about modularity,
> > then we should try to support it right through from build, development,
> > deployment and ultimately at runtime.
> 
> I think that we're serious about modularity as a packaging option (our 
> primary one) and something we're going to push for in the ecosystem. 
> However, I'm afraid of letting the tail wag the dog here...

I thought the intentions of the project with respect to modularity were
a little broader than just packaging.

> [SNIP]
> 
> > 
> >> Is it that since modules reference other modules, working in one
> >> module means you need the dependent modules around?
> > 
> > Yes, exactly.  We shouldn't require users to have the entire "source"
> > for all the dependent modules around. 
> 
> Are you accidentally conflating "user" and "developer" here?    They are 
> entirely different roles.

Yes, I was.  Sorry.  And yes, I appreciate just how different they can
be! ;-)

> Also, I certainly can see the theoretical value of this, but we tried 
> similar things in Geronimo, and it was, well, hell, because of the 
> coupling between Geronimo and its dependencies.  Eventually, we gave up 
> and had an 'uberbuild', which wasn't so bad - you'd checkout everything, 
> do the master build, and then go dive into your module, which is what I 
> think we talked about doing here before.
> 
> So I guess I'm just resistant due to painful experience on this.

It's hard to comment without understanding the specifics.

> > We should support modularity at the development level.  (Like we do
> > already with the stubs for the luni/security kernel classes.  We
> > don't require developers to have the VM around.)
> > 
> > When I'm developing C code I reference the libc header files but I don't
> > go poking around including random headers from the libc source.
> > 
> > So if I'm working on sql, I don't see why I shouldn't develop using
> > header files and other well-defined parts of the API for other modules
> > rather than having to have all of the source code checked out.
> 
> Because I think version problems are going to spiral out of control.
>
> > Of course, we should still support people working by checking out
> > everything but we shouldn't require it.
> 
> This has to be the case.  And IMO, whether you did something like "svn 
> co; build top" which fills in the "top-level header directory" (which I 
> think might be better per module), builds all jars, test jars, and libs, 
> and puts everything in place so you can go wander off and work in a 
> module, or do a lightweight checkout, stuff the hdk in a specific place 
> in your classlib/trunk tree, apart from versions, the situation must be 
> *identical*, or you are going to run into all sorts of build and 
> debugging/discussion hell.

Yes.  I agree it must be identical.

> >> A long time ago we talked about something like this, a pre-build step
> >> that produces a "hdk"-like set of artifacts that a developer could
> >> then use if they were focused down inside a module.
> >>
> >> Is this the same thing returning for discussion?
> > 
> > Not since I've been actively following the list but I'll dig about in
> > the archives later.
> >  
> >> Couldn't we just use a standard snapshot?  Take the latest nightly,
> >> drop into some magical place in the tree, and then go into your module
> >> and work from there?
> > 
> > Well, I was suggesting snapshots that might be:
> > 
> > 1) hdk (inc. jdk and jre)
> > 2) jdk (inc. jre)
> > 3) jre
> > 
> > but I think (?) you are suggesting using the one I overlooked:
> > 
> > 0) everything in classlib/trunk
> > 
> > I think 0) is going to get pretty big (but we should still create it)
> > and I think we should actively support using 1) for development too.
> > 
> > Wasn't someone recently pleading with Eclipse to make smaller artifacts?
> > ;-)
> 
> That's a totally different problem.
> 
> In the end, I don't care if we snapshot things like this to make it 
> easier, but I'm really worried about what this will become.
> 
> Also, I think it would be better to be clear about the issues in here.
> 
> So assuming I understand the issues, I'm not against this as long as the 
> world is indistinguishable if I do a svn co and a "make the world" 
> top-level build, or just checkout a module and drop in a hdk above it in 
> the tree.
> 
> As a matter of fact, I think that the hdk is simply a tar of the junk 
> created by a "make the world" build...

I wouldn't have put it quite like that, but yes perhaps that is correct.
I'm saying that the deploy/hdk tree should be created/used by a "make the
world" build in such a way that compiling a module would be the same as
if you had just untar'd an hdk snapshot.  This is possible/easy if no
module directly references another except via the "junk" in the hdk
(specifically the jre/lib/boot jars for java code).
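
As a rough sketch of that rule (the function name and the exact layout
under the hdk root are assumptions for illustration, not what the build
does today), a module's compile step would build its bootclasspath only
from the jars it finds under the hdk tree and never from a sibling
module's directory:

```python
import os
import tempfile
from pathlib import Path


def boot_classpath(hdk_root):
    """Return a javac-style bootclasspath built only from the jars found
    under <hdk_root>/jre/lib/boot (hypothetical layout, searched
    recursively)."""
    boot_dir = Path(hdk_root) / "jre" / "lib" / "boot"
    jars = sorted(str(p) for p in boot_dir.rglob("*.jar"))
    return os.pathsep.join(jars)


def demo():
    # Build a throwaway hdk tree holding two empty placeholder jars.
    with tempfile.TemporaryDirectory() as tmp:
        boot = Path(tmp) / "jre" / "lib" / "boot"
        boot.mkdir(parents=True)
        for name in ("luni.jar", "security.jar"):
            (boot / name).touch()
        return boot_classpath(tmp)
```

Whether hdk_root was filled in by a top-level build or by unpacking a
snapshot makes no difference to the result.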

> Also, for you, with multiple workspaces, I would imagine that your life 
> would be better with this being resolvable via a "pointer" in the build 
> properties (which defaults to "." or -ish), so you can have both a full 
> tree around, as well as one or more hdk snapshots.
> 
> The following is an example of having the full tree ('full_checkout') at 
> the same time as a hdk ('binary snapshot aka hdk') with three work areas 
> (just the prefs checked out from SVN, the RMI contribution from Intel 
> and the RMI contribution from ITC)
> 
> /your_see_drive :)
>     /full_checkout/
>            /deploy/
>                 artifacts
>            / whatever/
>     /binary snapshot aka hdk/
>            /deploy/
>                  artifacts
>     /checkout_of_pref_only/
>           build.props (points to /binary snapshot aka hdk/)
>           /modules/
>               /pref/
>     /contribution_1_from_SVN/
>           build.props (points to /full checkout/)
>            /modules/
>                 /RMI from Intel/
>     /contribution_2_from_SVN/
>           build.props (points to /full checkout/)
>            /modules/
>                  /RMI_from_ITC/
> 
> 
> That way, you can dork w/ the full_checkout, fix something, and then 
> your other work environments that are pointing at it get that fix w/o 
> any work.
> 
> I hope my crude art makes sense.

Yes.  But I'm not sure why I'd need the full_checkout around if I had
the hdk snapshot.  And I'm not sure I'd mind having several hdk's
around; they'd be more manageable than the code I currently copy
around.
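
For what it's worth, the "pointer" idea could be as small as the
following sketch.  The property name hdk.dir and both function names are
hypothetical (nothing like them exists in the build yet); the only part
taken from the discussion is that the pointer defaults to "." so a full
checkout keeps working with no configuration:

```python
from pathlib import Path


def read_props(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def resolve_hdk(workspace, props_text=""):
    """Return the directory whose deploy/ tree the build should use.

    'hdk.dir' is a hypothetical property name.  When it is absent the
    pointer defaults to '.', i.e. the workspace itself (the
    full-checkout case).  A relative value is taken relative to the
    workspace; an absolute value is used as given.
    """
    pointer = read_props(props_text).get("hdk.dir", ".")
    return (Path(workspace) / pointer).resolve()
```

So a prefs-only workspace would carry a single line such as
"hdk.dir=../hdk_snapshot" in its build properties, and several
workspaces could share one snapshot (or each have their own).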

Regards,
 Mark.

