commons-dev mailing list archives

From Incze Lajos <in...@mail.matav.hu>
Subject Re: cvs commit: jakarta-commons/digester/src/java/org/apache/commons/digester Digester.java
Date Mon, 14 Jan 2002 20:58:03 GMT
On Mon, Jan 14, 2002 at 06:38:08PM +0000, robert burrell donkin wrote:
> 
> On Monday, January 14, 2002, at 12:41 AM, Incze Lajos wrote:
> 
> > On Sun, Jan 13, 2002 at 08:07:34PM +0000, robert burrell donkin wrote:
> >> a bit interesting, this one. i think that processing of whitespace is
> >> parser dependent...
> >
> > I think not. White space is significant(*) in XML, so the parser can't
> > drop it. On the other hand, with SAX2 you can have an ignorable
> > whitespace filter (though it is not listed among the SAX2 core
> > features), so a SAX2 filter can trim the leading and trailing white
> > space for you and maybe compact the other runs of spaces to one space.
> > If this filter were standard then you could rely on it. However, the
> > whole status of SAX2, whether it's a standard or not, is a good question.
> >
> > (*) unless you use the xml:space attribute
> 
> hi incze
> 
> i've had a chance to read the SAX specs again. what you're saying is (i 
> think) correct but not quite what i was thinking of. parsers can return 
> whitespace either through the characters method or through the 
> ignorableWhitespace method. digester only keeps characters received 
> through the characters() method. so there might be a small difference in 
> digester's behaviour with different parsers - but probably not really 
> enough to worry about.
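
To make the two channels concrete, here is a minimal sketch of a SAX handler
that records what arrives through characters() and what arrives through
ignorableWhitespace(). The class name WhitespaceChannels and the sample
document are made up for illustration; this is not Digester code. How the
whitespace is split between the two callbacks depends on the parser and on
whether it has read a DTD, which is exactly the parser dependence under
discussion.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo handler: keeps characters() text, only counts ignorable whitespace.
class WhitespaceChannels extends DefaultHandler {
    final StringBuilder text = new StringBuilder(); // everything seen via characters()
    int ignorable = 0;                              // chars seen via ignorableWhitespace()

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        ignorable += length;
    }

    // Parses the given XML string with the JAXP default SAX parser.
    static WhitespaceChannels parse(String xml) {
        WhitespaceChannels handler = new WhitespaceChannels();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler;
    }

    public static void main(String[] args) {
        // The DTD says <a> has element-only content, so a parser that uses the
        // DTD may report the whitespace around <b> as ignorable.
        String xml = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b (#PCDATA)>]>"
                   + "<a>\n  <b> hi </b>\n</a>";
        WhitespaceChannels h = parse(xml);
        System.out.println("characters():          " + h.text.length() + " chars");
        System.out.println("ignorableWhitespace(): " + h.ignorable + " chars");
    }
}
```

A handler that only overrides characters(), as described for digester above,
would never see whatever the parser chooses to route through
ignorableWhitespace().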

I've reread the XML 1.0 spec, too, and it is just the other way round from
what I remembered. The 'xml:space' attribute can be "default" or "preserve".
"default" (which is the default, he-he) means that the XML author accepts
the application's whitespace policy. You have to specify "preserve" on
an element (it then applies to the whole subtree) if you want to be sure
that the app doesn't drop your whitespace. (On the other hand: the automatic
trim() in digester is simply a bug in this respect.)
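
A rough sketch of what honouring xml:space could look like in a SAX handler:
trim element bodies by default, but leave them alone inside a subtree marked
xml:space="preserve". The class SpaceAwareHandler and its bodies map are
hypothetical names for illustration, not the Digester implementation, and the
sketch only handles simple leaf text (the buffer is reset at every start tag).

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: trims body text unless xml:space="preserve" is in effect.
class SpaceAwareHandler extends DefaultHandler {
    final Map<String, String> bodies = new LinkedHashMap<>();     // element name -> body text
    private final Deque<Boolean> preserve = new ArrayDeque<>();   // one flag per open element
    private final StringBuilder buf = new StringBuilder();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        boolean inherited = !preserve.isEmpty() && preserve.peek();
        String v = atts.getValue("xml:space");
        // "preserve" switches preservation on, "default" switches it off,
        // and a missing attribute inherits the setting of the parent element.
        preserve.push("preserve".equals(v) || (v == null && inherited));
        buf.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buf.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        String body = buf.toString();
        bodies.put(qName, preserve.pop() ? body : body.trim());
        buf.setLength(0);
    }

    // Parses the given XML string and returns the collected element bodies.
    static Map<String, String> parse(String xml) {
        SpaceAwareHandler handler = new SpaceAwareHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.bodies;
    }

    public static void main(String[] args) {
        Map<String, String> m =
            parse("<r><a> x </a><b xml:space=\"preserve\"> y </b></r>");
        System.out.println("a=[" + m.get("a") + "] b=[" + m.get("b") + "]");
    }
}
```

An unconditional trim() would treat both elements the same way; here only the
element without xml:space="preserve" loses its leading and trailing blanks.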

incze
