commons-user mailing list archives

From robert burrell donkin
Subject Re: Digester trimming whitespaces
Date Sun, 03 Oct 2004 22:33:51 GMT

On 3 Oct 2004, at 22:12, Simon Kitching wrote:

> On Fri, 2004-10-01 at 20:40, Marco Mistroni wrote:
>> Hello all,
>> I am currently having a problem (?) with Digester, in the
>> sense that when parsing XML it is 'trimming' whitespace.
> Hi Marco,
> Yes, some Digester rules do this deliberately. If you look at the
> source for CallParamRule, CallMethodRule, etc. you will see something
> like:
>   bodyText = bodyText.trim();
> As this code precedes my involvement in Digester, I can't say exactly
> what the motivation was for doing this, but presume there was a good
> reason.

craig or scott would be needed to give a definitive answer to this one.

my guess is that since the handling of whitespace by parsers has been 
variable, the best way to achieve consistency is to lose all 

> I certainly have been using digester fairly heavily and not
> needed to allow leading/trailing whitespace in element bodies. However
> I can understand that some people might need to.
> I would recommend that you take a copy of the source of whatever rule
> is causing you problems and rename the class (including changing the
> package declaration to something in your namespace), then delete the
> trim() call.

i'm not sure whether this would do it.

i suspect that what would be needed would be for an additional flag to 
be added to digester that would pass on all calls to 
ignorableWhitespace to characters. depending on the parser used, some 
configuration may be necessary to ensure that the whitespace is passed 
on to digester.
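
a minimal sketch of the forwarding idea, written against the plain SAX DefaultHandler rather than digester itself. the class name WhitespaceForwardingHandler and the recordWhitespace flag are made up for illustration; digester has no such setting as far as i know. note that a parser generally only calls ignorableWhitespace when it knows from a DTD that the whitespace sits in element-only content, which is why parser configuration matters here.

```java
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not part of Digester): shows how calls to
// ignorableWhitespace could be passed on to characters behind a flag.
class WhitespaceForwardingHandler extends DefaultHandler {

    // Assumed flag name; true means "record whitespace".
    private final boolean recordWhitespace;
    private final StringBuilder bodyText = new StringBuilder();

    WhitespaceForwardingHandler(boolean recordWhitespace) {
        this.recordWhitespace = recordWhitespace;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Accumulate body text exactly as the parser reports it.
        bodyText.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // When recording, treat ignorable whitespace as ordinary text;
        // otherwise drop it, which is what DefaultHandler does anyway.
        if (recordWhitespace) {
            characters(ch, start, length);
        }
    }

    String getBodyText() {
        return bodyText.toString();
    }
}
```

with the flag off, the whitespace never reaches the buffer, so nothing changes for existing users.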

in terms of the rules, it would probably be neater and quicker to move 
the trim call out of the rule (where it may be called multiple times) 
and into digester. when digester was set to ignore whitespace, 
whitespace would be trimmed. when the setting was to record whitespace, 
ignorableWhitespace would pass the whitespace on to characters and the 
output wouldn't be trimmed.
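
roughly what i have in mind for moving the trim, again as a sketch on top of the plain SAX DefaultHandler and not real digester code. CentralTrimHandler, recordWhitespace and getBodies are hypothetical names, and the sketch only copes with flat (non-nested) elements, whereas digester keeps a stack of body texts. the point is just that the trim happens once, at the end of the element, instead of in every rule.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not part of Digester): trims body text in one
// place, and skips the trim when whitespace is being recorded.
class CentralTrimHandler extends DefaultHandler {

    private final boolean recordWhitespace; // assumed setting name
    private final StringBuilder bodyText = new StringBuilder();
    private final List<String> bodies = new ArrayList<>();

    CentralTrimHandler(boolean recordWhitespace) {
        this.recordWhitespace = recordWhitespace;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        // Simplification: start a fresh body for every element.
        bodyText.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        bodyText.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // Only kept when the handler is set to record whitespace.
        if (recordWhitespace) {
            characters(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = bodyText.toString();
        // The single trim call: rules would receive this value as-is.
        if (!recordWhitespace) {
            text = text.trim();
        }
        bodies.add(text);
        bodyText.setLength(0);
    }

    // Body text of each element, in the order the elements ended.
    List<String> getBodies() {
        return bodies;
    }
}
```

the rules would then use whatever body text they are handed and never call trim() themselves.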

should be quite an easy change to make but ensuring that recording 
whitespace worked might prove more tricky...

- robert
