commons-dev mailing list archives

From Oliver Zeigermann <oliver.zeigerm...@gmail.com>
Subject Re: [xmlio] comparison with Digester
Date Mon, 11 Oct 2004 06:12:23 GMT
Hi Simon,

I see you have put some energy into feedback! Thanks for that :)

On Sun, 10 Oct 2004 23:03:56 +1300, Simon Kitching
<simon@ecnetwork.co.nz> wrote:
> Hi,
> 
> I've had a look at the new "xmlio" code in the sandbox; below is my
> initial opinion. Note that I am a committer on Digester, and am
> therefore totally biased....

Note that I created xmlio and am therefore no less biased ;)

> The "in" code that is used to parse input xml documents is really really
> similar in concept to what Digester already does.
> 
> The main features appear to be:
> (a)
> A "callback handler list" (instead of a single ContentHandler object).
> 
> Well, this seems to me to be equivalent to the Digester concept of
> having multiple Rule objects match a particular input element, ie
> several distinct sections of code can all be triggered when a particular
> element is matched.

My understanding: xmlio just goes with the callback, while Digester
creates objects. This is a difference in interface as well as in
performance, right?

> (b)
> A complete path to the current element is passed to the "startElement"
> method.
> 
> Digester has the "getMatch" method which can be called by any rule to
> get the path to the current element. Xmlio does provide a SimplePath
> instance instead of a plain string to represent this path (equivalent to
> the File class wrapping a filename). However in Digester you don't
> really need anything more complex than a string because you don't
> normally do computations on paths anyway - you leave that up to the
> "rule matcher" class.

And a hierarchy of objects representing the XML structure, right?
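
To make point (b) concrete, here is a minimal sketch of how a handler
can keep track of the path to the current element. It uses only the
standard SAX API. PathTrackingHandler, currentPath and pathsOf are
hypothetical names made up for this illustration; this is neither
xmlio's SimplePath nor Digester's getMatch implementation.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only: not xmlio's SimplePath, not Digester's getMatch().
final class PathTrackingHandler extends DefaultHandler {
    // Names of the currently open elements, root first.
    private final Deque<String> open = new ArrayDeque<>();
    // Every path seen at element start, in document order.
    final List<String> seenPaths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        open.addLast(qName);
        seenPaths.add(currentPath());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.removeLast();
    }

    // Path as a plain string such as "/a/c/d".
    String currentPath() {
        return "/" + String.join("/", open);
    }

    // Convenience wrapper: parse a string and return all element paths.
    static List<String> pathsOf(String xml) {
        PathTrackingHandler handler = new PathTrackingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.seenPaths;
    }
}
```

A plain string is enough here because the sketch only reports the path;
a wrapper type would start to pay off once callers compute on paths.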

> (c)
> The xmlio concept of having a callback method invoked at element end
> which passes both the element text and the element attributes is mildly
> useful (but calling this method "startElement" is rather confusing IMO).
> It would certainly be possible to add this feature to Digester/Digester2
> (though it does have a minor performance drawback). With the current
> digester code, you can clone the attrs and push them on a (named) stack
> in begin() and then fetch them back in body() to get the same effect.

(1) Why do you think it is only mildly useful? In my experience, stuff
similar to this occurs all the time:

<parameter name="olli">xmlio</parameter> 

which you then get with a single callback. Besides, calling such a
method startElement might indeed be misleading. Better ideas?

Anyway, the only case where the above does not work is mixed content,
i.e. tags mixed with text, which usually occurs only in flow text.
Flow text hardly needs detailed and special treatment by xmlio or
Digester. Do you have other examples where mixed content occurs and
would need a detailed treatment?
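
The single-callback idea in (1) can be sketched with plain SAX: copy
the attributes when the element starts, collect its text, and hand
both over once the element ends. This is roughly the "clone the attrs
and push them on a stack" approach described in the quoted text.
LeafCallbackHandler and onLeaf are hypothetical names for this
illustration and are not part of xmlio or Digester. The sketch assumes
that elements containing child elements (including mixed content) are
simply skipped.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only: not the xmlio or Digester API.
final class LeafCallbackHandler extends DefaultHandler {
    // State kept for each element that is currently open.
    private static final class Frame {
        final Attributes attrs;
        final StringBuilder text = new StringBuilder();
        boolean hasChildElements;
        Frame(Attributes attrs) { this.attrs = attrs; }
    }

    private final Deque<Frame> frames = new ArrayDeque<>();
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (!frames.isEmpty()) {
            frames.peekLast().hasChildElements = true;
        }
        // The parser may reuse its Attributes object, so keep a copy.
        frames.addLast(new Frame(new AttributesImpl(attrs)));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!frames.isEmpty()) {
            frames.peekLast().text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        Frame frame = frames.removeLast();
        // Only text-only elements get the combined callback.
        if (!frame.hasChildElements) {
            onLeaf(qName, frame.attrs, frame.text.toString());
        }
    }

    // The single callback: element name, attributes and text together.
    void onLeaf(String name, Attributes attrs, String text) {
        events.add(name + "[name=" + attrs.getValue("name") + "]=" + text);
    }

    static List<String> parse(String xml) {
        LeafCallbackHandler handler = new LeafCallbackHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.events;
    }
}
```

For the parameter element shown earlier, wrapped in some root element,
onLeaf fires once with the name attribute and the text together, so
the application code needs no stack of its own.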

(2) xmlio was built for simplicity and transparent use. No funky
details in the background, no surprises, all obvious. I am more than
convinced all this can be done with Digester as well, but maybe not
this simple and obvious and easy to learn and do. E.g. you will rarely
need to maintain any additional stacks in xmlio, at least not for
that.

> My initial feeling, therefore, is that I would much rather see
> additional work put into the Digester project than having this new xmlio
> project essentially recreate a subset of Digester functionality.
> 
> The main problem with Digester, I think, is that it has nasty
> inter-class dependencies that prevent subsets of the classes from being
> distributed. Every Rule class depends upon the Digester class, which
> provides "parse context". But the Digester class has factory methods for
> all the rules - so Digester can't be distributed without including *all*
> the Rule classes. Breaking this dependency is number one priority for
> Digester2 as far as I am concerned. It isn't that hard to do; I've been
> experimenting with various refactorings already.
> 
> I would love to see several jar files built from the Digester2 source: a
> "src" distro, a "full" jar, and a "basic" jar. The basic jar would have
> about 8 classes, being about 4 classes of core functionality and 4 basic
> Rule classes. In this form I think the "basic" jar could be entirely
> appropriate for use by projects such as an i18n library without
> resorting to creating a new project. Xmlio by itself doesn't provide any
> functionality to actually instantiate objects or set properties; you
> need to write one or more subclasses of SimpleImportHandler (similar to
> ContentHandler), so by the time that is done I think that code based on
> Digester and xmlio would be pretty similar in size.
> 
> There is one other significant issue: required libraries. Digester
> depends upon BeanUtils, mainly because it performs dynamic conversion
> between strings and other datatypes such as int, bool, etc. For a
> light-weight parsing library this could be a nuisance; I expect we could
> find a way of making automatic datatype conversion (and therefore
> BeanUtils functionality) optional though. Digester also depends upon
> commons-logging. I did make a proposal a while ago to make logging
> dependencies in Digester optional; the patch wasn't received with any
> great enthusiasm at the time, but with people actually pushing for this
> it might make it into Digester2.
> 
> Regarding the "out" part of the xmlio libs: this is basically a
> collection of static functions doing simple but useful xml string
> encoding etc., and a stream class that does auto-indenting. Digester
> certainly doesn't have anything like this. This code does feel like it
> might be at home in "lang" or "codec"...

Plus pushing XML into byte streams. Besides, there are quite a few
pieces of code lying around in Jakarta doing similar stuff. Maybe we
could take care of them as well...
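
As a rough idea of what such "out" helpers look like, here is a
minimal sketch of a static XML text encoder plus a method that pushes
an element into a byte stream as UTF-8. XmlEscape, escape and
writeElement are hypothetical names for this illustration, not the
actual xmlio classes, and the sketch assumes the element name passed
in is already a valid XML name.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not the xmlio "out" API.
final class XmlEscape {
    private XmlEscape() {}

    // Replace the five characters that have predefined XML entities.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Write <name>escaped text</name> to the stream as UTF-8 bytes.
    // Assumes 'name' is already a valid XML element name.
    static void writeElement(OutputStream out, String name, String text) {
        String xml = "<" + name + ">" + escape(text) + "</" + name + ">";
        try {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```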

> Oliver, if there was a "digester2" project which provided a "basic" jar
> that was pretty light-weight and had only optional dependencies on
> commons-beanutils and on commons-logging, might you consider using that
> in i18n (or even Slide) instead of the xmlio code? (And would you be
> interested in helping to create digester2??).

Can't speak for i18n, but if what you have then is fine, why not use it...

> I'm finally going to be free of the horror that is my current job in
> December, and plan to spend a fair chunk of January getting a Digester2
> up and running (assuming that Craig/Robert et al are happy with that).
> Even if xmlio goes ahead, and the i18n component uses it I will still
> keep its features in mind when working on Digester2.

Good luck,

Oliver


