xmlgraphics-general mailing list archives

From Jeremias Maerki <dev.jerem...@greenmail.ch>
Subject Re: XML Graphics Commons: last call
Date Thu, 11 Aug 2005 08:30:25 GMT

On 11.08.2005 02:34:35 Thomas DeWeese wrote:
> Jeremias Maerki wrote:
> > On 09.08.2005 13:12:18 Thomas DeWeese wrote:
> 
> 
> >>>On 08.08.2005 22:31:09 J.Pietschmann wrote:
> >>
> >>>>2. Seeing Avalon as dependency makes me uneasy. I'd prefer
> >>>>commons code to be refactored to not using Avalon and to
> >>>>provide raw functionality only.
> >>
> >>    So I am a bit concerned about the addition of three new
> >>external dependencies for Batik.  I realize that some of it is
> >>already there but hidden in the PDF transcoder.
> >>
> >>Jeremias Maerki wrote:
> >>
> >>>Everybody always complains about that little library. What's the problem?
> >>>We're not going to do Avalonization anymore. But the configuration
> >>>subpackage in A-F is a perfect (!) little tool and a LOT smaller than
> >>>Jakarta Commons Configuration. 
> >>
> >>    What are we configuring?
> > 
> > Fonts and optional PDF-fine-tuning, mostly. Not everything that could
> > also be configured in FOP 0.20.5 is already implemented in the trunk.
> 
>     Ok so I took some time and I really don't like this model.
> This is exactly the sort of thing that belongs at the application
> layer not at the library layer.  The idea of 'passing data' around
> the application through a config file seems really bad to me.
> (it's an excellent example of a programmer's hack, look at all the
> cool stuff I can do, so I can also break things in really odd
> unpredictable ways - that's a small price to pay for tweakability).

Ok, we disagree then. It probably doesn't help either to say that
certain things in Batik similar to that topic make my head spin, too.

>     The problem is that I'm sure this is exactly the sort of
> thing that sparked the flame wars.  I also am not going to even
> come close to have the time or interest to provide a replacement.

There you got it wrong. It was actually exactly the sort of thing we do
here as a whole: Componentizing a complex system. People mostly didn't
understand Avalon at the time I introduced it but as far as I can
remember we didn't have major flamewars because of that. I even got most
of the people convinced that the Avalonization was a good thing.

Fact is that both Batik and FOP are stable products, having lost their
hype-factor, and they are too small or too specialized to attract many
corporate supporters. We have to fight to make ourselves attractive. It must be
as easy as possible for newbies to jump in. So we have to find solutions
for that. Somehow. Both FOP and Batik IMHO suffer from the "big blob of
code" problem. I don't know the ideal solution to that problem, yet.
Good ideas are always welcome.

> >>>If anything, those who have a problem
> >>>with A-F should propose a viable alternative (I haven't seen one, yet)
> >>>or work with the Excalibur guys to extract the configuration subpackage
> >>>into a stand-alone lib.
> >>
> >>    Since I have no idea what the library is and/or what you
> >>plan to use it for I can't really provide alternatives.
> 
>     This should be done the old fashioned way, the library should export
> interfaces to support configuration and the _application_ should
> provide the configuration data to the library as appropriate for
> the application.

That's exactly what's being done, just with the help of an existing and
stable little library instead of having to come up with yet a new way of
configuring a system (userconfig.xml with SAX handler in FOP 0.20.5,
key-controlled property system in Transcoders in Batik, etc. etc.).
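
A minimal sketch of that model, using only the JDK: the library exports a small read-only configuration interface, and the application decides where the data comes from. All names here (RendererConfig, MapConfig, PdfSettings, the keys "font-dir" and "compression-level") are hypothetical and chosen for illustration; this is not the Avalon Framework API and not code from FOP or Batik.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical read-only configuration contract exported by the library. */
interface RendererConfig {
    String getValue(String key, String defaultValue);
    int getValueAsInt(String key, int defaultValue);
}

/** Application-side implementation; the data could equally be loaded from an XML file. */
class MapConfig implements RendererConfig {
    private final Map<String, String> values = new HashMap<>();

    MapConfig set(String key, String value) {
        values.put(key, value);
        return this;
    }

    @Override
    public String getValue(String key, String defaultValue) {
        String v = values.get(key);
        return v != null ? v : defaultValue;
    }

    @Override
    public int getValueAsInt(String key, int defaultValue) {
        String v = values.get(key);
        if (v == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(v.trim());
        } catch (NumberFormatException e) {
            // Fail-safe: a malformed value falls back to the default.
            return defaultValue;
        }
    }
}

/** Hypothetical library component; it only ever sees the interface. */
class PdfSettings {
    final String fontDir;
    final int compressionLevel;

    PdfSettings(RendererConfig cfg) {
        this.fontDir = cfg.getValue("font-dir", "fonts");
        this.compressionLevel = cfg.getValueAsInt("compression-level", 6);
    }
}

class ConfigDemo {
    public static void main(String[] args) {
        // The application supplies the data; "oops" is deliberately malformed.
        RendererConfig cfg = new MapConfig()
                .set("font-dir", "/usr/share/fonts")
                .set("compression-level", "oops");
        PdfSettings settings = new PdfSettings(cfg);
        System.out.println(settings.fontDir + " / " + settings.compressionLevel);
    }
}
```

The point of the sketch is only the direction of the dependency: the library names the values it needs, and the application stays free to feed them from a file, a GUI or hard-coded defaults.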

> I think this is really trying to do an end run
> around this.  There is no clear description of the data that passes
> across this interface - that's really ugly IMO.

That's mostly because nobody's done the documentation, yet.

> > But A-F contains a configuration subpackage [2] which is very small,
> > very easy to use and provides exactly what is necessary to load a
> > configuration tree from an XML file and to access values in a type-safe
> > and fail-safe manner while not having to deal with the deficiencies of a
> > DOM.
> 
>     The problem is that I don't care how big or small a dependency is
> once the dependency is there it causes problems.

Dependency: The great FUD-word. Sorry for being sarcastic.
pdf-transcoder.jar contains commons-logging, commons-io and
avalon-framework repackaged in the JAR. Batik didn't seem to have a
problem with that until now. And that's the whole point: People who have
problems with additional JARs (and that's probably the main problem) can
get a repackaged JAR if need be. That's what I've done for
pdf-transcoder.jar. That's what Xalan-J did in their latest release,
they repackaged Apache BCEL and Apache Regexp in their xalan.jar.
Dependencies are nothing more than Java classes that are not directly
coming from the same project. Like we in FOP depend on Batik and have
been bitten a number of times by interface changes. Things like that
happen. That's why people and projects should talk to each other. That's
partly why I suggested FOP and Batik to go together under a common
umbrella, so we can be closer together. So far, this seems to work well. But
the whole thing is also about reusing code. It's a great opportunity to
avoid bugs and to speed up development, not having to reinvent the wheel
each time. That is also one of the reasons behind XML Graphics Commons.
FOP could have continued with its own SVG implementation. But that would
have made no sense. FOP is extremely happy about the dependency on Batik
because of the benefits!

Granted, Avalon Framework obviously faces too much resistance and in the
light of that needs to be taken out, at least out of certain code parts.
I can do that if need be. Reluctantly, but I can do it. I don't think
anybody else is ready to volunteer for that ATM, or am I wrong?

> >>    I'm also not thrilled with adding lots of logging dependencies. 
> >>Logging makes sense for 'non-interactive' contexts but is not really
> >>what you want for most interactive contexts.
> > 
> > AFAICS there is exactly one dependency on Commons Logging. Avalon
> > Logging is not used anymore.
> 
>     Once again it doesn't matter how big or small the dependency is
> a dependency is a dependency.  Also the lots of logging dependencies
> come from the pervasiveness of logging.  Once you are in it's hard
> to change.

We've changed from Avalon-style logging to Commons Logging and it wasn't
that hard. It's much less of a problem than constantly running after
stray System.outs. It's simply important to know how logging should be
used. It took us a little time in FOP to find that out, but we should be
fine, now.

>     So what is Commons IO used for?  I assumed it was used for
> Logging.

No. Commons IO is a small library containing low-level helpers for
recurring tasks such as copying streams (which everybody otherwise has
to reinvent each time, with bugs, of course), loading streams into byte
arrays, dealing with file names. It contains helpful Stream and Reader
implementations, for example, a memory-optimized version of
ByteArrayOutputStream. Javadocs here to get an idea:
http://jakarta.apache.org/commons/io/apidocs/index.html
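
To illustrate the kind of code that gets reinvented, here is a plain-JDK sketch of a stream copy and a "load stream into byte array" helper. The class and method names (StreamCopy, copy, toByteArray) are made up for this example; Commons IO ships ready-made helpers of this kind (for instance IOUtils.copy), so this only shows what one would otherwise write by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopy {

    /** Copies all bytes from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        // read() returns -1 at end of stream. A classic bug is to ignore the
        // return value and write the whole buffer on every iteration.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Loads the remaining content of a stream into a byte array. */
    static byte[] toByteArray(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(in, out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "some stream content".getBytes("UTF-8");
        byte[] copied = toByteArray(new ByteArrayInputStream(data));
        System.out.println(copied.length + " bytes copied");
    }
}
```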

Here's a list of Apache projects using Commons IO (scroll to "Dependees"):
http://vmgump.apache.org/gump/public/jakarta-commons/commons-io/details.html
(The dependency of Commons IO on Commons Lang in the Gump descriptor is
bogus BTW. Commons IO has no runtime dependencies.)

Want to see the dependees list for Commons Logging? :-)
http://vmgump.apache.org/gump/public/jakarta-commons/commons-logging/details.html

> > I don't see how logging disturbs operation in interactive contexts. 
> 
>     It doesn't necessarily disturb it but it doesn't help either.

It can help the developer. Logging is easily turned off entirely for an
interactive application. There is almost no performance penalty for logging
if a few very simple rules are followed.
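
One such rule, sketched below, is to check the log level before building an expensive message. To keep the example free of extra JARs it uses java.util.logging as a stand-in; with Commons Logging the same pattern would use the corresponding isDebugEnabled() check on the Log instance. The class LayoutStep, the logger name and the counter are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LayoutStep {
    private static final Logger LOG = Logger.getLogger("demo.layout");

    /** Counts how often the expensive message was actually built. */
    static int dumps = 0;

    /** Stands in for something costly, such as dumping a large data structure. */
    static String expensiveDump() {
        dumps++;
        return "area tree: ...";
    }

    static void process() {
        // Guard: the string concatenation and the dump only happen
        // when the message would really be logged.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("After layout: " + expensiveDump());
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.OFF); // logging switched off entirely
        process();
        System.out.println("expensive message built " + dumps + " time(s)");
    }
}
```

With logging off, the only cost left at each call site is the level check itself.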

> > Do you prefer logging by System.out? Or can you do without logging? 
> 
>     I add/uncomment System.err messages (and often make use
> of 'new Error("blah").printStackTrace()') as needed to track problems.
> In the past I've developed elaborate logging systems (mostly C/C++)
> and they can be useful for situations where build times are long
> or you need to support remote clients.  But Java build times are tiny
> and end users can access our source.

Yes, in C/C++ you can use macros and all that ugly stuff to filter out
logging code at build time if you don't want it. No go for Java. But
adding and removing System.err/outs all the time is not really a good
solution. There are a few things you can't do with them:
- You can't turn on logging through a configuration file in a production
environment to investigate strange behaviour in a server application.
- You can't just tell people with a problem to specify "-d" on the
command-line so we get more helpful information on what is probably
going on inside the code. This makes helping people a lot easier.
- You can't switch on a certain group of logging statements for a special
task at hand while developing. Being able to do that makes me so much more
productive.
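
As a rough sketch of the "-d" idea: a command-line switch raises the level of a named group of log statements without touching the code that logs. Again java.util.logging stands in for the real logging facility so the example runs on a bare JDK; DebugSwitch, levelFor and configure are hypothetical names, and this is not how FOP's command line is actually implemented.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

class DebugSwitch {

    /** Returns FINE if "-d" is among the arguments, INFO otherwise. */
    static Level levelFor(String[] args) {
        for (String arg : args) {
            if ("-d".equals(arg)) {
                return Level.FINE;
            }
        }
        return Level.INFO;
    }

    /** Configures the logger group with the given name according to the arguments. */
    static Logger configure(String name, String[] args) {
        Level level = levelFor(args);
        Logger logger = Logger.getLogger(name);
        logger.setLevel(level);
        // Handlers filter independently of the logger, so give this group
        // its own console handler at the same level.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(level);
        logger.addHandler(handler);
        logger.setUseParentHandlers(false);
        return logger;
    }

    public static void main(String[] args) {
        Logger log = configure("demo.render", args);
        log.info("Rendering started");
        log.fine("Only visible when started with -d");
    }
}
```

Because loggers are named hierarchically, the same mechanism can switch on just one group (say, everything under "demo.render") for the task at hand.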

> > I learned that I can't. Logging is a very powerful means while
> > debugging (dev- and deployment-time).
> 
>     As I said logging can be useful in non-interactive circumstances,
> but in interactive situations you need to provide feedback through
> the GUI. In my experience logging is useless for this - the type
> and style of messages that goes into logs is very different from
> the GUI.

Exactly. That's a point that is on my task list for FOP. Logging is for
developers, end-users in an interactive environment need different
feedback. So we finally agree on something. I'm a little happier now.

> Now there are potentially non-interactive applications
> for Batik components (the transcoders mostly) but right now they
> handle logging/notification through the UserAgent classes.

Yes, but the name "UserAgent" implies that this is used for interaction
with the user. Who interacts with the developer, giving him the
information he needs?

> >>---
> >>
> >>Other comments:
> >>
> >>    You should probably add 'SVGGraphics2D' to the java2D tree,
> >>and possible components coming from Batik.
> > 
> > 
> > Done. Remember, it's a Wiki! And we get change logs on
> > commits@xmlgraphics.apache.org. :-)
> > 
> > 
> >>    I'm not sure what the intent of the code structure is.  It
> >>looks like each of the 'parts' is intended to be an independent
> >>'project' (i.e. separate source trees).  Would it make more
> >>sense to just have one source tree and separate based on
> >>packages (this is what Batik does to build its sub jars)?
> > 
> > That's one possibility but people have a hard time understanding what is
> > in which of the batik JARs. 
> 
>     I'm not sure that splitting the code up helps here.  I think it
> makes the problem worse (which tree is the stuff I need in?).

Ok, as I said, I'm not too sure about the benefits, but I don't think
it makes the problem worse, either. I'd appreciate additional opinions
here. But then, no opposition from me putting this in a combined directory
structure. I like the separation because it forces you to think a little
more about the dependencies (package dependencies here, not JAR
dependencies) while coding. It makes for more reusable and more cleanly
designed components.

> > Like FOP, Batik is a big blob of code. What I intend with the 
> > split is to make it easier and (more importantly) less overwhelming 
> > for newbies to start working on certain parts. FOP is scary
> > and so is Batik. Experience shows that despite a clean package structure,
> > people still don't find their way. I don't say that my proposal will
> > definitely solve that problem but I suspect it might help. At any rate,
> > this is an aspect that we need to take very seriously. Batik (even more
> > than FOP) needs to attract new contributors. What happens if you and/or
> > Cameron suddenly go inactive? There's no one left. Project dies. (Sorry,
> > my PMC hat shows here)
> 
>     I'm happy to split out independent pieces but in my opinion they
> are either completely separate projects (independent releases possible)
> or they should be one code base.  I think the 1/2 way solution is
> just annoying for everyone.

Ok, so let's start from the combined directory structure, see how it
turns out and decide later if it makes sense to split in any way. It
doesn't really fix the "big blob of code" problem but if we can move
forward, then I'm fine with it. At least moving these code parts to the
Commons area is already a good step in the right direction.

> > Is that plausible?
> > 
> > 
> >>    The current system seems like a lot of extra build nonsense
> >>since each jar has to be built in turn so it can be used to
> >>compile the next project, right?
> > 
> > Not necessarily. I don't think a separate build for each package is
> > necessary. I think it's simply helpful to more clearly show to
> > interested people what the individual parts are. 
> 
>     Isn't this better done in the documentation?  I fail to see how
> introducing a bunch of extra empty directories does anything to
> help comprehension.  And as long as you have one big build you
> don't even get enforcement of unidirectional dependencies.

Again, same arguments as above. But you're right, documentation is a big
step towards this already. It's not the empty directories that are the
problem, though; it's the number of outer branches (directories)
of the tree. People don't find out what parts are hidden in the tree.
Better have a forest, each tree representing a component/tool. But again,
no problem taking a small step first. Documentation will help.

> > This also favors a cleaner design which is an additional benefit. 
> > I think Victor Mote could tell you a good tale how he alone 
> > already profited from dividing FOray (a FOP fork) into digestible 
> > chunks. 
> 
>     My point is that it doesn't do anything for good design.  There
> is no advantage to this split from a design point of view.

Ok, so we disagree again. Can't be helped. Let's move forward.

So. Do we all agree that I should change the suggested directory tree to
show a combined package structure, not a separate one?

Another thing I can do (if no one objects) while moving code from FOP to
Commons is to remove Commons Logging from certain library parts (PDF lib,
for example) which are not crucial. Maybe even all of it. That should be
possible without too much backfire. I'd kill everybody who'd want to
deprive me of the use of a logging facility in FOP's layout engine,
though. :-) I'd like to keep the dependency on Commons IO. I can offer
to do the repackaging work in the Ant builds.


Jeremias Maerki


---------------------------------------------------------------------
Apache XML Graphics Project URL: http://xmlgraphics.apache.org/
To unsubscribe, e-mail: general-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: general-help@xmlgraphics.apache.org

