commons-user mailing list archives

From Craig McClanahan
Subject Re: Digester trimming whitespaces
Date Mon, 04 Oct 2004 04:59:38 GMT
On Mon, 04 Oct 2004 10:12:36 +1300, Simon Kitching wrote:
> On Fri, 2004-10-01 at 20:40, Marco Mistroni wrote:
> > Hello all,
> >       I am currently having a problem (?) with Digester, in the
> > sense that when parsing XML it is 'trimming' whitespace.
> Hi Marco,
> Yes, some Digester rules do this deliberately. If you look at the source
> for CallParamRule, CallMethodRule, etc. you will see something like:
>   bodyText = bodyText.trim();
> As this code precedes my involvement in Digester, I can't say exactly
> what the motivation was for doing this, but presume there was a good
> reason. I certainly have been using digester fairly heavily and not
> needed to allow leading/trailing whitespace in element bodies. However I
> can understand that some people might need to.

Here's a real simple use case ... parsing web.xml files in Tomcat. 
Regardless of the technical niceties of how XML parsers actually work,
users expect something like:

  <servlet-class>com.mypackage.MyServlet</servlet-class>

and

  <servlet-class>   com.mypackage.MyServlet   </servlet-class>

to have the same semantic effect.  That is accomplished by trimming
whitespace off the body content before using its contents. 
Consistency (the "principle of least surprise") will then encourage us
to do the same thing anywhere else the body content is processed, so
we did.
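
As a rough illustration of that point, here is a minimal plain-Java sketch. It does not use the Digester API; the class name TrimDemo and the method normalizeBody are made up for this example, and the method only mirrors the bodyText.trim() call quoted earlier in the thread.

```java
// Plain-Java sketch; TrimDemo and normalizeBody are hypothetical names,
// not part of Digester.
final class TrimDemo {

    // Mirrors the bodyText.trim() step: strips leading and trailing
    // whitespace (spaces, tabs, newlines) from element body text.
    static String normalizeBody(String bodyText) {
        return bodyText.trim();
    }

    public static void main(String[] args) {
        String compact = "com.mypackage.MyServlet";
        String padded = "   com.mypackage.MyServlet   ";
        String indented = "\n    com.mypackage.MyServlet\n  ";

        // All three bodies yield the same class name once trimmed.
        System.out.println(normalizeBody(compact).equals(normalizeBody(padded)));
        System.out.println(normalizeBody(compact).equals(normalizeBody(indented)));
    }
}
```

Without the trim, the padded body would be passed along as a class name with spaces around it, which is almost certainly not what the author of the file intended.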

> I would recommend that you take a copy of the source of whatever rule is
> causing you problems and rename the class (including changing the
> package declaration to something in your namespace), then delete the
> trim() call.
> If you feel like contributing a patch to add some kind of boolean flag
> to the original Rule class to allow people to enable/disable trimming of
> whitespace (including unit tests) then I think there would be some
> interest in applying this to Digester - I certainly think that would be
> useful.
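
For what it's worth, the kind of boolean flag suggested above might look roughly like the sketch below. This is only an assumption about the shape of such a patch: BodyTextHandler, setTrimBody and process are hypothetical names and do not exist in Digester, and the class stands in for a rule rather than extending the real Rule class.

```java
// Hypothetical stand-in for a rule with a trimming switch; none of these
// names are part of the Digester API.
final class BodyTextHandler {

    // Defaults to true so that existing behaviour (trimming) is unchanged.
    private boolean trimBody = true;

    void setTrimBody(boolean trimBody) {
        this.trimBody = trimBody;
    }

    boolean isTrimBody() {
        return trimBody;
    }

    // Returns the body text as it would be handed on to the target object.
    String process(String bodyText) {
        if (bodyText == null) {
            return null;
        }
        return trimBody ? bodyText.trim() : bodyText;
    }

    public static void main(String[] args) {
        BodyTextHandler handler = new BodyTextHandler();
        System.out.println("[" + handler.process("  some text  ") + "]");

        handler.setTrimBody(false);
        System.out.println("[" + handler.process("  some text  ") + "]");
    }
}
```

Keeping trimming on by default would mean nobody parsing configuration files sees a change unless they ask for one.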

That would indeed be useful for some scenarios for using Digester
other than parsing configuration files.  However, even there I suspect
it's going to be quite common for the producer of XML content to be
able to add newlines and indentations (for readability) of body
content, so the need to avoid trimming is certainly not going to be

> Regards,
> Simon

