commons-dev mailing list archives

From "Edelson, Justin" <>
Subject RE: [digester] mixed content update
Date Wed, 24 Mar 2004 02:12:50 GMT
I created a subclass of Digester (MixedContentDigester) to do this. Along with a new rule (see
below), it passes my simple test case. Would this be useful code to add? I figured creating
the subclass would make this easier to integrate as it doesn't break anything.
In short, I create a new abstract class called TextRule and a concrete class AddTextRule.
When startElement() on the MixedContentDigester is called, before the rules are invoked, a
search is done for rules matching match + /@text (was it you, Scott, who threw this out on
the 2002 thread? I forgot, but it looked good to me). The AddTextRule's body() method gets
called and the bodyText StringBuffer is emptied. Something similar happens on the call to
This is probably not the best explanation, but it does work.
The idea of the abstract TextRule class was that you could have different TextRules, not all
of which did something like "adding." Perhaps it should just be called CallTextMethodRule
(a lot of the rule code is closely related to that in CallMethodRule). I did, however, create
another rule called AddTrimmedTextRule and was thinking about AddNormalizedTextRule (using
JDOM's definition of "normalized")...
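
For readers following the thread, here is a minimal sketch of the underlying idea: flush the accumulated body text every time an element starts or ends, so each run of text is delivered on its own. This is not the MixedContentDigester or AddTextRule code described above. It deliberately uses only the JDK's SAX API rather than Digester, and the names MixedContentSketch, SegmentHandler, and segments are hypothetical. Trimming each segment is also an assumption, loosely in the spirit of the AddTrimmedTextRule mentioned above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; plain SAX, not Digester.
public class MixedContentSketch {

    /** Collects each run of text separately instead of one concatenated body. */
    static class SegmentHandler extends DefaultHandler {
        final List<String> segments = new ArrayList<>();
        private final StringBuilder bodyText = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks, so buffer it.
            bodyText.append(ch, start, length);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // Text seen before a child element is handed off here.
            flush();
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Text seen after the last child element is handed off here.
            flush();
        }

        private void flush() {
            // Assumption: trim each segment and skip whitespace-only runs.
            String text = bodyText.toString().trim();
            if (!text.isEmpty()) {
                segments.add(text);
            }
            bodyText.setLength(0); // empty the buffer for the next run
        }
    }

    static List<String> segments(String xml) throws Exception {
        SegmentHandler handler = new SegmentHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.segments;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b><c>beginning text <d attr=\"foo\"/> ending text</c></b></a>";
        System.out.println(segments(xml)); // prints [beginning text, ending text]
    }
}
```

In a Digester-based version, the flush in startElement() and endElement() would presumably be where the text rules for the current match pattern get fired; that wiring is left out here because it depends on Digester internals.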

	-----Original Message----- 
	From: Scott Sanders [] 
	Sent: Tue 3/23/2004 2:23 PM 
	To: 'Jakarta Commons Developers List' 
	Subject: RE: [digester] mixed content update

	Digester is not set up to handle mixed content.  I would use something else,
	or modify Digester to do what you want.
	Scott (originator of the mixed content thread)
	> -----Original Message-----
	> From: Edelson, Justin []
	> Sent: Tuesday, March 23, 2004 9:18 AM
	> To: Jakarta Commons Developers List
	> Subject: [digester] mixed content update
	> I'm trying to figure out the best way to digest some XML with mixed
	> content, i.e.
	> <a>
	>     <b>
	>         <c>beginning text <d attr="foo"/> ending text</c>
	>     </b>
	> </a>
	> Where it's important for "beginning text" and "ending text" to be
	> treated separately.
	> I looked through the mailing list archives and found a discussion from
	> early 2002 on this subject. It looks like the net result of that
	> discussion was that, in my example above, the content "beginning text
	> ending text" is made available by using a CallMethodRule.
	> Has there been any subsequent discussion? I got the sense that the
	> decision really was that mixed content wasn't "for" Digester in the
	> sense that Digester is targeted to loading configuration files that
	> "tend to be either all-attributes or all-body-content"
	> (
	> I'll happily give up using Digester to accomplish my mixed-content
	> project and switch to JDOM (or even look at the Avalon Configuration
	> stuff someone mentioned), but I wanted to check with the list before
	> giving up.
	> Thanks,
	> Justin
