commons-dev mailing list archives

From "Edelson, Justin" <>
Subject RE: [digester] mixed content update
Date Wed, 24 Mar 2004 16:36:42 GMT
I looked at the NodeCreateRule approach, but this removes much of what I
like about Digester - no DOM code, very declarative, and ease of rule
reuse (especially with wildcard patterns).

Since coding these changes, I've rethought whether or not this needs to
be a subclass - since @text() can't be a legal XML element, existing
code shouldn't be affected. After starting to duplicate all of the
existing Digester test cases to run against my new subclass, I realized
this was more copy and paste than I really wanted to do, so I modified
the current Digester from CVS instead, and all of the test cases passed.

I'd like to go ahead and submit the patches & test cases into bugzilla,
but would like to get a reading on whether or not my original assumption
was correct - is a new subclass (MixedContentDigester) more or less
likely to get submitted to CVS than a patch to Digester, all other
things (documentation, unit tests, the code itself) being equal?


-----Original Message-----
From: Craig R. McClanahan [] 
Sent: Wednesday, March 24, 2004 11:00 AM
To: Jakarta Commons Developers List; Edelson, Justin
Cc: Jakarta Commons Developers List
Subject: RE: [digester] mixed content update

Quoting "Edelson, Justin" <>:

> I created a subclass of Digester (MixedContentDigester) to do this. 
> Along with a new rule (see below), it passes my simple test case. 
> Would this be useful code to add? I figured creating the subclass 
> would make this easier to integrate as it doesn't break anything.

In addition to this approach, take a look at NodeCreateRule (in version
1.4 or later), contributed by Christopher Lenz.  It is designed to
swallow mixed content and give you back a DOM of the body content of the
matched element (which still has to be well formed XML).

Craig McClanahan
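
As a rough illustration of the DOM route described above, the sketch below uses plain JAXP rather than Digester itself. It assumes that the node handed back for the matched element looks like an ordinary org.w3c.dom.Element, and it collects each text run under <c> separately. The class name DomMixedContent and the method textRuns are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Digester: walks a DOM and keeps the
// text runs of a mixed-content element apart.
class DomMixedContent {

    /** Returns each text node directly under the first <c> element, in document order. */
    static List<String> textRuns(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element c = (Element) doc.getElementsByTagName("c").item(0);

        List<String> runs = new ArrayList<>();
        NodeList children = c.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Child elements such as <d/> are skipped; only text nodes are kept.
            if (child.getNodeType() == Node.TEXT_NODE) {
                runs.add(child.getNodeValue());
            }
        }
        return runs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b><c>beginning text <d attr=\"foo\"/> ending text</c></b></a>";
        for (String run : textRuns(xml)) {
            System.out.println("[" + run + "]");
        }
    }
}
```

This is the kind of DOM code the NodeCreateRule approach leaves to the caller once the node has been handed over.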

> In short, I create a new abstract class called TextRule and a concrete
> class AddTextRule. When startElement() on the MixedContentDigester is 
> called, before the rules are invoked, a search is done for rules 
> matching match + /@text (was it you, Scott, who threw this out on the 
> 2002 thread? I forgot, but it looked good to me). The AddTextRule's 
> body() method gets called and the bodyText StringBuffer is emptied. 
> Something similar happens on the call to endElement().
> This is probably not the best explanation, but it does work.
> The idea of the abstract TextRule class was that you could have 
> different TextRules, not all of which did something like "adding." 
> Perhaps it should just be called CallTextMethodRule (a lot of the rule
> code is closely related to that in CallMethodRule). I did, however, 
> create another rule called AddTrimmedTextRule and was thinking about 
> AddNormalizedTextRule (using JDOM's definition of "normalized")...
> Justin
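
The flush-on-start/flush-on-end idea described above can be sketched with a bare SAX handler. This is not the MixedContentDigester patch and does not use Digester's rule machinery. It only assumes that the buffered body text is handed over and emptied whenever a child element starts or the parent element ends. The class name TextFlushSketch and its parse helper are invented for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: collects the separate text runs of one element.
class TextFlushSketch extends DefaultHandler {
    private final String parentName;                  // element whose text runs we want
    private final List<String> runs = new ArrayList<>();
    private final StringBuilder bodyText = new StringBuilder();
    private int depth = 0;                            // current element nesting depth
    private int parentDepth = -1;                     // depth of the open target element, -1 if none

    TextFlushSketch(String parentName) {
        this.parentName = parentName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flush();                                      // a child starts: hand over text seen so far
        depth++;
        if (parentDepth < 0 && qName.equals(parentName)) {
            parentDepth = depth;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth == parentDepth) {                   // only direct text of the target element
            bodyText.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();                                      // the trailing run, if any
        if (depth == parentDepth) {
            parentDepth = -1;
        }
        depth--;
    }

    // Moves the buffered text into the result list and empties the buffer.
    private void flush() {
        if (bodyText.length() > 0) {
            runs.add(bodyText.toString());
            bodyText.setLength(0);
        }
    }

    static List<String> parse(String xml, String parentName) throws Exception {
        TextFlushSketch handler = new TextFlushSketch(parentName);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.runs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b><c>beginning text <d attr=\"foo\"/> ending text</c></b></a>";
        for (String run : parse(xml, "c")) {
            System.out.println("[" + run + "]");
        }
    }
}
```

In a Digester-based version, the place where this sketch adds to a list is presumably where a rule matching the /@text pattern would have its body() method called.
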
> 	-----Original Message----- 
> 	From: Scott Sanders [] 
> 	Sent: Tue 3/23/2004 2:23 PM 
> 	To: 'Jakarta Commons Developers List' 
> 	Cc: 
> 	Subject: RE: [digester] mixed content update
> 	Justin,
> 	Digester is not set up to handle mixed content.  I would use 
> something else,
> 	or modify Digester to do what you want.
> 	Scott (originator of the mixed content thread)
> 	> -----Original Message-----
> 	> From: Edelson, Justin []
> 	> Sent: Tuesday, March 23, 2004 9:18 AM
> 	> To: Jakarta Commons Developers List
> 	> Subject: [digester] mixed content update
> 	>
> 	> I'm trying to figure out the best way to digest some XML with mixed
> 	> content, i.e.
> 	>
> 	> <a>
> 	>     <b>
> 	>         <c>beginning text <d attr="foo"/> ending text</c>
> 	>     </b>
> 	> </a>
> 	>
> 	> Where it's important for "beginning text" and "ending text" to be
> 	> treated separately.
> 	>
> 	> I looked through the mailing list archives and found a discussion from
> 	> early 2002 on this subject. It looks like the net result of that
> 	> discussion was that, in my example above, the content "beginning text
> 	> ending text" is made available by using a CallMethodRule.
> 	>
> 	> Has there been any subsequent discussion? I got the sense that the
> 	> decision really was that mixed content wasn't "for" Digester in the
> 	> sense that Digester is targeted to loading configuration files, which
> 	> "tend to be either all-attributes or all-body-content"
> 	>
> 	>
> 	>
> 	> I'll happily give up using Digester to accomplish my
> 	> project and switch to JDOM (or even look at the Avalon
> 	> stuff someone mentioned), but I wanted to check with the list before
> 	> giving up.
> 	>
> 	> Thanks,
> 	> Justin
> 	>
> 	>
> 	> To unsubscribe, e-mail:
> 	> For additional commands, e-mail: 