maven-dev mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: [RANT] This Maven thing is killing us....
Date Wed, 05 Jul 2006 10:10:01 GMT
Jason van Zyl wrote:
> 
> On 4 Jul 06, at 1:45 PM 4 Jul 06, Steve Loughran wrote:
>>
>> In a way, much of the stuff in M2 is experimental; a build tool that 
>> effectively encodes beliefs about how a project should be structured 
>> and delivered, focusing on component-based development instead of 
>> application dev. I also think it's time to look at how well some of the 
>> experiment is working.
>>
> 
> You make it sound like we're some sort of cult :-)

I think you are exploring cutting edge loosely coupled software 
development processes. It's research. Interesting, fun research, but 
research nonetheless. Just as Gump is an experiment in whether a unified 
nightly build changes people's working processes.

I've been hanging round with semantic-web people recently, and have 
devolved into using the word "belief" where they use "fact", because of 
differences of opinion on what they and I think RDF triples are (they 
think they're facts in a graph, I think every triple is a belief 
published by an entity at a particular moment in time). The nice thing 
about a belief-centric model is you get to accept the fact that 
different entities have different beliefs, and a single entity/agent can 
change its belief set, without ever having to worry about the fact that 
the global belief-set is inconsistent.

In real agent-oriented runtimes (still very much academic research, even 
more than RDF engines), the resolver takes into account the metadata 
about which agent issued a belief statement and when during its 
resolution process. Newer statements by the same entity can override 
older ones; differences between entities are allowable but result in 
ambiguities that may need to be dealt with further down the line.
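
A minimal sketch of that resolution rule, to make it concrete. Everything here is hypothetical (the `Belief` and `BeliefResolver` names, the string-valued statements); it is not taken from any real agent runtime or RDF engine. It only shows the two behaviours described above: a newer statement by the same agent replaces its older one, and disagreement between agents is kept as an ambiguity rather than rejected.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** One belief: "agent asserted subject=value at issuedAt". Hypothetical model. */
final class Belief {
    final String agent;
    final String subject;
    final String value;
    final long issuedAt;

    Belief(String agent, String subject, String value, long issuedAt) {
        this.agent = agent;
        this.subject = subject;
        this.value = value;
        this.issuedAt = issuedAt;
    }
}

final class BeliefResolver {
    /**
     * Returns, per subject, the distinct values still believed once each
     * agent's older statements have been overridden by its newer ones.
     * More than one value for a subject is an unresolved ambiguity.
     */
    static Map<String, Set<String>> resolve(List<Belief> beliefs) {
        // Step 1: per subject and agent, keep only the newest statement.
        Map<String, Map<String, Belief>> latest = new LinkedHashMap<>();
        for (Belief b : beliefs) {
            Map<String, Belief> byAgent =
                    latest.computeIfAbsent(b.subject, k -> new LinkedHashMap<>());
            Belief current = byAgent.get(b.agent);
            if (current == null || b.issuedAt >= current.issuedAt) {
                byAgent.put(b.agent, b);
            }
        }
        // Step 2: collect the distinct surviving values per subject.
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Belief>> e : latest.entrySet()) {
            Set<String> values = new LinkedHashSet<>();
            for (Belief b : e.getValue().values()) {
                values.add(b.value);
            }
            result.put(e.getKey(), values);
        }
        return result;
    }

    /** A subject is ambiguous when agents still disagree about it. */
    static boolean isAmbiguous(Set<String> values) {
        return values.size() > 1;
    }
}
```

The resolver never fails on inconsistency; it just reports it, and whoever consumes the result decides what to do further down the line.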

When you apply the same agent-oriented view to POM metadata, you can say 
"a POM file represents the pom author's beliefs about the artifact's 
dependencies at the time they wrote the POM". It may be the beliefs 
match what the artifact really needs, it may be those beliefs turn out 
to be utterly wrong.

[interlude. I just grabbed the chair of the W3C RDF working group by the 
coffee machine. Apparently "a belief is a state of mind", "a fact is 
something that is believed". So all facts are beliefs, the only variable 
being the number of believers]

Because the ibiblio repository contains fact/belief metadata from so 
many sources, it's that much harder to reconcile than metadata from 
single entities. The good news is that we do have a very nice way to 
test these assertions in Java: running the program and seeing what 
classes get loaded. So when someone is utterly wrong in their 
dependencies, it's pretty obvious. It's when they are slightly wrong, 
when they use some classes only in certain cases, often using reflection 
to bind at run time, that you can get caught out.
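
To illustrate the "slightly wrong" case, here is a small sketch of a reflective probe. The `ClassProbe` class and the class names passed to it are made up for illustration. The point is that a class bound by name at run time is invisible until the code path that asks for it actually executes, so a missing dependency only shows up then. (On HotSpot JVMs, running with `-verbose:class` is one way to see what really gets loaded.)

```java
/** Hypothetical helper: checks whether a class can be found on the classpath. */
final class ClassProbe {
    /**
     * Tries to load the named class without initializing it.
     * Returns false if the class, or something it links against, is missing.
     */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The declared dependencies did not put this class on the classpath.
            return false;
        }
    }
}
```

A check like `ClassProbe.isLoadable("org.example.SomeOptionalBackend")` (a made-up name) only tells you about the classpath you ran it on, which is exactly why reflective bindings are the ones that catch people out.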


> 
> The phrase "encoding beliefs" is an inaccurate description. It is 
> simply the pursuit of best practices for software development and those 
> practices are very much mutable, this thread being very good evidence of 
> that. We are also not only focused on component-oriented development; we 
> develop applications ourselves and we're trying to make that 
> coherent as well.

OK, how about "encoding the team's ideas and experience in how to build 
applications as sets of components, using shared repositories to 
exchange components and their metadata"?

> 
>> Personally, I always experience a bit of fear when adding a new 
>> dependency to a project. the repository stuff, and estimate a couple 
>> of hours to get every addition stable, primarily by building up a good 
>> exclusion list.
> 
> This is the place to talk about that as people shouldn't be fearful 
> adding dependencies. But people who have an ideal setup where they 
> completely control the repository they use internally don't have many of 
> the problems that people are experiencing in this thread. Having a 
> public repository of high quality is not a trivial task.
> 
>>
>> Is it worse than before? Better? Or just, well, different? and if 
>> things are either worse or not as good as they could be, what can be 
>> changed?
>>
> 
> The process is absolutely better. The process coupled with the public 
> infrastructure we have now is problematic. Two very different things.
> 
>> One underlying cause seems to be pom quality. Open source software dev 
>> is a vast collection of loosely coupled projects, and what we get in 
>> the repository in terms of metadata matches this model. Each project 
>> produces artifacts that match their immediate needs, with POM files 
>> that appear to work at the time of publishing. Maven then caches those 
>> and freezes that metadata forever, even if it turns out that the 
>> metadata was wrong.  There's far better coherence within Gump, where 
>> the metadata is effectively maintained more by the gump team 
>> themselves than by the individual projects.
> 
> There is absolutely no way this is scalable over time. You are saying 
> that a small group of people can maintain metadata for projects that 
> they are not intimately involved with? That's like saying that people 
> who live outside your community have a better chance at describing your 
> community. I really just don't think that's possible. How many problems 
> has Gump had over the years trying to maintain the metadata? Huge 
> problems, almost never in sync with projects. You basically find out 
> when it breaks and go back track most of the time. There is no doubt 
> that the same process will happen with Maven where users of Maven will 
> eventually make their metadata better but that will take time. Gump has 
> been around for 5-6 years now. People are really only starting to use 
> Maven 2.x which is closing in on being out for a year. I am willing to bet 
> in another year a great number of the problems seen in this thread will 
> be gone. I would argue that Gump will not work precisely because it is 
> not the projects themselves maintaining the metadata. Projects using 
> Maven will eventually get it right because it provides some value to 
> them to get it right.


Oh I agree, handwritten custom-coded stuff doesn't scale. That is the 
price of that model, and it makes it hard to use the same tools within 
your own build process. But it does support the low-hanging fruit of 
projects that depend on commons-logging yet don't want logkit on their 
classpath.

Gump's problem is not just that the metadata is written by the gumpers, 
and not the projects, but that the projects don't always care if the 
build is broken. Getting someone to care about what happens to their 
stuff downstream is the first step to fixing the problem. As more m2 
takeup occurs, you should get a lot of that feedback in the system, 
moving from the "please redist on the maven repository", to "please have 
good metadata", before finally, the joy of silence, as everything works.

> 
>>
>> Question is, what to do about it? And if the m2 repository was an 
>> attempt to leave the problems of the M1 repository behind, has it worked?
>>
> 
> To a large extent I would say we have fixed many of the problems on a 
> technical level. Correcting the metadata and educating projects as to how 
> best to maintain it is a social problem and a matter of education. 
> Couple that with some automated integrity checks that will be performed 
> by the repository manager.

Yes, I think more rigour in accepting POMs would be good. People, 
even Apache projects, should not be able to submit an artifact to the 
repository without
-everything you depend on being there. No unresolvable artifacts.
-no dependencies on -SNAPSHOT. I know, Apache projects aren't meant to 
release in that mode, but Apache Muse managed it, with very bad 
consequences downstream.
-a (manual) review of your dependencies. You, the submitter, would get 
told your dependencies; the repository mail list would somehow get a 
submission note that listed the complete dependency graph of that 
component.
-the repository analyzer has some (extensible) rules about generally 
"bad" dependencies, those that should be flagged with a warning, e.g. 
junit.jar in the runtime, any of the XML implementations in there 
(rather than just the stax/xml-apis API imports), use of commons-logging 
over commons-logging-api.
-flag appearance of strongly-deprecated versions of things, e.g. 
junit-3.7, anything else that is not in modern use and/or with security 
holes.
-scan the artifacts to see which packages they publish; store a list of 
all classes. Then scan their imports to see what they explicitly import. 
Warn when something they import isn't published by anything they even 
optionally depend upon.
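
As a rough sketch of what a few of those checks might look like, assuming a much-simplified dependency model: the `Dep` and `PomRules` classes below are invented for illustration and are not part of Maven or any repository manager, and the rule set is just the examples from the list above hard-coded, where a real analyzer would want them extensible.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, much-simplified view of one declared dependency. */
final class Dep {
    final String artifactId;
    final String version;
    final String scope;

    Dep(String artifactId, String version, String scope) {
        this.artifactId = artifactId;
        this.version = version;
        this.scope = scope;
    }
}

final class PomRules {
    /** Returns one finding per violated rule; an empty list means "accept". */
    static List<String> check(List<Dep> deps) {
        List<String> findings = new ArrayList<>();
        for (Dep d : deps) {
            // Hard rule: released artifacts must not depend on snapshots.
            if (d.version.endsWith("-SNAPSHOT")) {
                findings.add("ERROR: " + d.artifactId + " is a -SNAPSHOT dependency");
            }
            // Warning: junit leaking into the runtime classpath.
            if (d.artifactId.equals("junit") && !d.scope.equals("test")) {
                findings.add("WARN: junit declared outside test scope");
            }
            // Warning: strongly-deprecated version (example from the list above).
            if (d.artifactId.equals("junit") && d.version.equals("3.7")) {
                findings.add("WARN: junit 3.7 is strongly deprecated");
            }
            // Warning: full commons-logging where the API jar might do.
            if (d.artifactId.equals("commons-logging")) {
                findings.add("WARN: consider commons-logging-api instead of commons-logging");
            }
        }
        return findings;
    }
}
```

Errors would block the submission; warnings would go into the note sent to the submitter and the repository list.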

we could have some fun there, given the appropriate amount of spare 
time. I quite like the idea of .class level validation...
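
The core of the import check could be as small as a set comparison. This sketch assumes the package names have already been extracted from the .class files by some other tool (that extraction is the real work and is not shown); `ImportValidator` and its parameters are hypothetical names.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

final class ImportValidator {
    /**
     * Returns the imported packages that nothing provides: not the artifact
     * itself, and not any of its (even optional) dependencies. Only java.*
     * is treated as platform-provided in this sketch.
     */
    static Set<String> unresolvedPackages(Set<String> importedPackages,
                                          Set<String> ownPackages,
                                          Collection<Set<String>> publishedByDependencies) {
        Set<String> unresolved = new TreeSet<>();
        for (String pkg : importedPackages) {
            if (pkg.startsWith("java.") || ownPackages.contains(pkg)) {
                continue;
            }
            boolean found = false;
            for (Set<String> published : publishedByDependencies) {
                if (published.contains(pkg)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                unresolved.add(pkg); // candidate for a repository warning
            }
        }
        return unresolved;
    }
}
```

This would not catch classes bound by reflection, but it would flag the POMs that simply forgot to declare something they import.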

-steve


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org

