commons-dev mailing list archives

From: Simon Kitching
Subject: Re: [digester] initial code for Digester2.0
Date: Thu, 03 Feb 2005 02:38:30 GMT

On Thu, 2005-02-03 at 02:11 +0100, Oliver Zeigermann wrote:
> On Thu, 03 Feb 2005 11:39:01 +1300, Simon Kitching <> wrote:
> > > I was also wondering, there may be occasions where it is desirable to
> > > have the full body *including tags* passed in a callback. This would
> > > mostly apply to mixed-content tags where text is mixed with style
> > > information that does not need processing, as with XHTML.
> > 
> > You mean stringify the child elements too, like XSLT does if you ask for
> > the text of a mixed-content element?
> Yes.
> > I suppose we could do this, though I am not entirely sure how much use
> > this would be. Can you think of a use-case?
> Think of the transformation of our web pages. There is structure
> information wrapping pure XHTML. You would not want a callback for all
> formatting tags, would you? Maybe this is not a very common use of
> Digester, though...

Ok, I see. It would be reasonably simple to implement; we already
calculate the full text for each element (so we can pass it to the body
methods) in the SAXHandler class; we just need to keep appending these
instead of discarding them when the element ends.

One issue, I guess, is that by the end of the document we have a
StringBuffer that contains the entire text for the entire document -
which might take up a bit of memory. So maybe we need some mechanism for
an Action to tell the SAXHandler [from its begin() method, via a mixin
interface, or otherwise] that it wants a full text tree. The SAXHandler
can then start accumulating.
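
A minimal sketch of the opt-in accumulation idea, using plain SAX rather
than Digester's own SAXHandler. The class FullBodyHandler and its
"target" parameter are hypothetical names introduced for illustration;
the handler re-serializes child tags only while it is inside the
requested element, so nothing is buffered for the rest of the document.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not Digester's SAXHandler): records the content,
// child tags included, of the first element whose name equals `target`.
class FullBodyHandler extends DefaultHandler {
    private final String target;
    private final StringBuilder body = new StringBuilder();
    private int depth = 0;        // 0 means "not accumulating"
    private boolean done = false; // true once the target element has ended

    FullBodyHandler(String target) { this.target = target; }

    @Override
    public void startElement(String uri, String local, String qName, Attributes attrs) {
        if (depth > 0) {
            // Inside the target: write the child start tag back out as text.
            body.append('<').append(qName);
            for (int i = 0; i < attrs.getLength(); i++) {
                body.append(' ').append(attrs.getQName(i))
                    .append("=\"").append(escape(attrs.getValue(i))).append('"');
            }
            body.append('>');
            depth++;
        } else if (!done && qName.equals(target)) {
            depth = 1; // accumulation starts only when asked for
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (depth == 0) return;
        depth--;
        if (depth == 0) {
            done = true; // target closed: stop accumulating
        } else {
            body.append("</").append(qName).append('>');
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) body.append(escape(new String(ch, start, length)));
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    String getBody() { return body.toString(); }

    // Convenience entry point: parse `xml` and return the target's full body.
    static String extract(String xml, String target) throws Exception {
        FullBodyHandler handler = new FullBodyHandler(target);
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getBody();
    }
}
```

In a real integration the "start accumulating" signal would come from
the Action rather than from a fixed element name.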

If you wished to contribute such a patch, I think I'd be in favour of
it.

> > If you mean pass a DOM tree into the Action to represent the "full body"
> > content, I think not :-).
> Certainly not. I think there is no place for the DOM in Digester.

Phew! :-)

> > > > Or are you by chance referring to my suggestions for xml-rules?
> > >
> > > No, what are they?
> > 
> > I was puzzled about your reference to "reflection" in the previous
> > email, as accessing Rule (now Action) classes is never done via
> > reflection. However in the RELEASE-NOTES.txt I do discuss possible
> > updates to the classes in the xmlrules package to use reflection to make
> > Action classes accessible via the xmlrules mapping file rather than have
> > the xmlrules java code contain an explicit mapping class for each Action
> > as is currently done.
> Is that so? I have no internal knowledge of beanutils, but I thought
> there is no other way of calling a parameterized method than by
> reflection methods. But I am happy to learn something here :)

Just some minor misunderstanding I think..

The digester framework invokes Rule (Action) classes directly. There is
no reflection involved in the invocation of Rule (Action) classes.

I am proposing that xmlrules actually uses reflection to generate a set
of Action objects when parsing its rule configuration input file. Of
course the parsing of the actual user input would then be done in the
normal manner (with the digester framework calling the Actions
directly).

The Rule (Action) classes interact with domain-specific (user) classes
via BeanUtils and reflection. I don't see any alternative, except for
the "pre-processor" type xml mapping tools, or runtime bytecode
generation, neither of which are really Digester's domain.
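
A rough sketch of how a rules file could name Action classes and have
them instantiated reflectively. Everything here is hypothetical:
MappedAction stands in for the Action type, and CreateObjectAction and
ReflectiveActionFactory are illustrative names, not part of Digester or
xmlrules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Action type (not the real Digester interface).
interface MappedAction {
    String describe();
}

// Example action that a rules file might refer to by class name.
class CreateObjectAction implements MappedAction {
    public String describe() { return "create-object"; }
}

class ReflectiveActionFactory {
    // Instantiate one action from a class name read from the rules file.
    static MappedAction create(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        if (!MappedAction.class.isAssignableFrom(cls)) {
            throw new IllegalArgumentException(className + " is not a MappedAction");
        }
        // Requires a no-argument constructor on the named class.
        return (MappedAction) cls.getDeclaredConstructor().newInstance();
    }

    // Build the whole action set up front; invoking the actions later
    // involves no reflection at all.
    static List<MappedAction> createAll(List<String> classNames)
            throws ReflectiveOperationException {
        List<MappedAction> actions = new ArrayList<>();
        for (String name : classNames) {
            actions.add(create(name));
        }
        return actions;
    }
}
```

Reflection is confined to the configuration step, so no explicit
mapping class per Action is needed in the xmlrules code.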

> > 
> > I remember the main issue being that Digester is built around the
> > concept of having patterns control what operations were executed for
> > each xml element, and having the invoked logic partitioned into many
> > small Rule classes.
> > 
> > You wished the user to write a big switch statement in Java to determine
> > what operations were executed, as you felt that this was more natural to
> > people used to writing SAX code by hand.
> > 
> > We did briefly discuss ways of layering the code so that these were two
> > possible options the user could choose between, but I couldn't see then
> > how this would be possible.
> Thanks for reminding me of my reservations :) Now I remember!
> Especially when writing rather simple import code I think it is much
> easier and more obvious to have all the code in one place instead of
> having it distributed into many classes. However, this seems to be
> rather simple to accomplish. You just register a single action to be
> matched for all elements and then access the context to tell you the
> path of the current element. Maybe there could be a convenience method
> to match paths against the current element directly.
> Wouldn't this work?

Hmm.. If we had a class that implements RuleManager that always returns
a custom Action no matter what the path, then all events would be
forwarded to the user-provided action, where the user can call
context.getMatchPath() to access the current path, and determine from
there what operations to perform.

// xmlio-style digester
Action myHandler = new AbstractAction() {
  public void begin(
   Context context,
   String namespace, String name,
   Attributes attrs) {
    String path = context.getMatchPath();
    if (path.equals("......")) {
      // handle this element
    } else {
      // handle other elements
    }
  }

  public void body(...) {
    // handle the element's body text
  }
};

RuleManager xmlioRuleManager = new XMLIORuleManager(myHandler);
Digester d  = new Digester();


> Speed is another issue with xmlio, as it is really fast. But with some
> optimizations geared towards this, digester shouldn't really be much
> slower anyway...


> > If you can think of some way of merging these quite different
> > approaches, I'm very keen to hear it. Or if you feel more kindly toward
> > a "distributed" pattern-matching + Action class approach, then that
> > would resolve the major issue and we can look at how the other xmlio
> > features could be provided in Digester (well, we can do that anyway!).
> Are you thinking of the export features?

No, just wondering in general if there is stuff that can be merged.
I've not thought too much about obj->xml, and anyway Betwixt has that
reasonably well covered as far as I know.

> Thinking of the import features, having more than one action being
> invoked on a certain element would be essential. Just think of some
> sort of logging or debugging action that is triggered with every
> element next to the normal processing. Does this currently work with
> digester 2?

Having multiple Rule (or Action) instances triggered for an element in
the input has always been supported, and definitely will be present in
digester2; it's critical.

If you mean having some debug Action that is triggered *for every
element seen* in addition to the ones whose patterns actually match,
then that can be done fairly easily by subclassing a Rules (in
digester1.x) or RuleManager (in digester2.x) class. I guess we could
build it in to the default class though...
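
A small sketch of the subclassing approach, under stated assumptions:
ElementAction and SimpleRuleManager are hypothetical stand-ins for
Action and RuleManager (with exact-path matching only), not the real
digester2 classes. The subclass prepends a debug action to whatever the
normal pattern matching returns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Action: only a begin callback taking the path.
interface ElementAction {
    void begin(String path);
}

// Hypothetical stand-in for a RuleManager; matches exact paths only.
class SimpleRuleManager {
    private final Map<String, List<ElementAction>> rules = new HashMap<>();

    void addRule(String path, ElementAction action) {
        rules.computeIfAbsent(path, k -> new ArrayList<>()).add(action);
    }

    // Returns a fresh, mutable list of the actions registered for this path.
    List<ElementAction> getMatchingActions(String path) {
        return new ArrayList<>(rules.getOrDefault(path, List.of()));
    }
}

// Adds one debug action to every element, matched or not.
class DebugRuleManager extends SimpleRuleManager {
    private final ElementAction debugAction;

    DebugRuleManager(ElementAction debugAction) {
        this.debugAction = debugAction;
    }

    @Override
    List<ElementAction> getMatchingActions(String path) {
        List<ElementAction> actions = super.getMatchingActions(path);
        actions.add(0, debugAction); // debug action runs before the normal ones
        return actions;
    }
}
```

With this arrangement the normal rules stay untouched, and the debug
behaviour can be switched on by swapping in the subclass.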

Thanks by the way for all your comments. It's great to know other people
are interested in a digester2...


