commons-dev mailing list archives

From Simon Kitching
Subject RE: [digester] mixed content update
Date Thu, 25 Mar 2004 00:01:33 GMT
On Thu, 2004-03-25 at 04:36, Edelson, Justin wrote:
> Since coding these changes, I've rethought whether or not this needs to
> be a subclass - since @text() can't be a legal XML element, existing
> code shouldn't be affected. After starting to duplicate all of the
> existing Digester test cases to run against my new subclass, I realized
> this was more copy and paste than I really wanted to do and modified the
> current from CVS and all of the test cases passed.
> I'd like to go ahead and submit the patches & test cases into bugzilla,
> but would like to get a reading on whether or not my original assumption
> was correct - is a new subclass (MixedContentDigester) more or less
> likely to get submitted to CVS than a patch to Digester - all other
> things (documentation, unit tests, the code itself) being equal?

Why not submit whichever is the least work for you? That way you can get
some feedback without spending too much effort up front. It's not likely
that a significant patch like this will be committed in its first
version anyway [none of mine have...].

If the patch has a significant performance hit on Digester, then it
would need to be either a subclass or require explicit enabling. And if
it is large and complex then I think a subclass would probably be
better. A simple and efficient extension to digester is probably best
integrated directly into the Digester class. But as mentioned above,
it's probably best to just get the discussion started by providing a
patch in either format (whichever is easiest), and get agreement later
on which is best.

I'll definitely have a look at your proposed patch, but at the moment
I'm not wildly keen on the idea. Of course I'm just one contributor -
and I reserve the right to change my opinion if the patch is blindingly
brilliant :-).

My initial feeling is that tacking "@text" on to the end of a pattern is
not elegant. It is neither consistent with the Digester's current style,
nor with xpath.

And I'm not sure there is any great demand for this feature from the
general target user group of digester. Which doesn't mean that such a
feature won't be accepted; however it does mean that it has a much
greater chance if it is a small and elegant solution. Or if it puts in
place a framework that can be built on to add other features to
digester.

If your "@text" pattern proposal is consistent with some standard, or
with the way some other app does things, then please mention this as
this would be a big point in favour of the design.

And if you think there are other features that could be added to
digester using "@...."-style patterns then that would also be good to
know.

And if you could provide some use-cases showing what kinds of problems
people can solve with this feature, that would be good.

Please don't think I'm trying to discourage you from submitting the patch.
The suggestions above are intended to increase the chances of a patch
being accepted.

Here's a suggestion for an alternative implementation. I haven't thought
this through deeply, so it may be broken. But it would avoid having
special pattern strings. This suggestion is just intended to stir the
pot of potential solutions :-)

  Define an interface called MixedContentRule or similar which rule
  classes can implement. 

  In Digester's startElement method:
    if bodytext not null:
      for each rule matched by the last call to startElement:
        if rule implements MixedContentRule
          call that rule's content(bodytext) method

The effect should be that any rule which implements the MixedContentRule
interface (and therefore has an extra content(String) method) gets its
content method called whenever there is a piece of text followed by a
nested element.
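
As a rough illustration of the idea above, here is a minimal Java sketch. It
does not use the real Digester classes: MiniDigester and CollectingRule are
hypothetical stand-ins invented for this example, and MixedContentRule is
only the interface name suggested above, not an existing Digester API. The
sketch follows the pseudocode literally, so it inherits whatever flaws that
has.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface from the suggestion above; not part of Digester.
interface MixedContentRule {
    void content(String bodyText);
}

// Hypothetical rule that simply records each piece of text it is given.
class CollectingRule implements MixedContentRule {
    final List<String> seen = new ArrayList<>();

    @Override
    public void content(String bodyText) {
        seen.add(bodyText);
    }
}

// Simplified stand-in for the relevant part of Digester (an assumption,
// not the real class): it only tracks pending body text and the rules
// matched by the previous startElement call.
class MiniDigester {
    private final StringBuilder bodyText = new StringBuilder();
    private List<Object> lastMatched = new ArrayList<>();

    // Called for character data between tags.
    void characters(String text) {
        bodyText.append(text);
    }

    // Called when a new element starts; matchedRules are the rules that
    // matched this new element.
    void startElement(List<Object> matchedRules) {
        if (bodyText.length() > 0) {
            String text = bodyText.toString();
            // Hand the pending text to the rules of the previous element.
            for (Object rule : lastMatched) {
                if (rule instanceof MixedContentRule) {
                    ((MixedContentRule) rule).content(text);
                }
            }
            bodyText.setLength(0);
        }
        lastMatched = matchedRules;
    }
}

class MixedContentSketch {
    public static void main(String[] args) {
        // Simulates parsing: <p>Hello <b>world</b></p>
        CollectingRule paragraphRule = new CollectingRule();
        MiniDigester digester = new MiniDigester();
        digester.startElement(Collections.<Object>singletonList(paragraphRule)); // <p>
        digester.characters("Hello ");
        digester.startElement(new ArrayList<Object>());                          // <b>
        System.out.println(paragraphRule.seen);
    }
}
```

Note that because the sketch uses "the rules matched by the last call to
startElement", text that follows a nested element's end tag would be handed
to the nested element's rules rather than the parent's. A real
implementation would probably need to track rules per element depth.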


