xml-general mailing list archives

From "HERRICK, CHUCK (SBCSI)" <CH6...@momail.sbc.com>
Subject RE: Xerces 1.0.2 DOMParser, whitespace and #text elements
Date Wed, 08 Mar 2000 23:35:49 GMT
It's the XML keyword ANY.

If the DTD says
--
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT Outermost (Inner) >
<!ELEMENT Inner (#PCDATA)>
--
The DOMParser with setIncludeIgnorableWhitespace(false)
ignores whitespace.

If the DTD says
--
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT Outermost ANY >
<!ELEMENT Inner (#PCDATA)>
--
The DOMParser with setIncludeIgnorableWhitespace(false)
does NOT ignore whitespace, and it creates
whitespace-only #text nodes.

Perhaps my understanding is in the weeds here?
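
The two cases above can be tried with a minimal sketch. It does not use the Xerces 1.0.2 DOMParser API from this thread. As an assumption, it substitutes the standard JAXP calls setValidating(true) and setIgnoringElementContentWhitespace(true), which play a similar role. The class name, helper names, and sample document are hypothetical. It prints how many child nodes Outermost ends up with under each content model, so the difference between (Inner) and ANY can be checked on whatever parser is installed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; not part of Xerces or of the original thread.
public class IgnorableWhitespaceDemo {

    // Builds a small valid document whose internal DTD declares
    // Outermost with the given content model, e.g. "(Inner)" or "ANY".
    static String xmlFor(String outermostModel) {
        return "<?xml version=\"1.0\"?>\n"
             + "<!DOCTYPE Outermost [\n"
             + "<!ELEMENT Outermost " + outermostModel + ">\n"
             + "<!ELEMENT Inner (#PCDATA)>\n"
             + "]>\n"
             + "<Outermost>\n"
             + "  <Inner>text</Inner>\n"
             + "</Outermost>\n";
    }

    // Parses with validation on and ignorable whitespace dropped,
    // then returns the number of direct children of the root element.
    static int childCount(String outermostModel) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            factory.setIgnoringElementContentWhitespace(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Fail loudly on validity errors instead of logging them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            Document doc = builder.parse(
                new InputSource(new StringReader(xmlFor(outermostModel))));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("(Inner): " + childCount("(Inner)") + " child node(s)");
        System.out.println("ANY:     " + childCount("ANY") + " child node(s)");
    }
}
```

This fits the XML 1.0 definition of "element content", which covers only content models made up of child elements. ANY also permits character data, so a parser has no grounds for treating whitespace inside such an element as ignorable.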

-----Original Message-----
From: HERRICK, CHUCK (SBCSI) 
Sent: Wednesday, March 08, 2000 5:11 PM
To: general@xml.apache.org
Subject: RE: Xerces 1.0.2 DOMParser, whitespace and #text elements


Well, this is weird.
I have two XML documents. 
Both have DTDs and are
valid.

The Xerces DOMParser builds
a Document from one without
whitespace-only #text nodes
and builds a Document from 
the other _WITH_ whitespace-only
#text nodes.

Same Java code.

-----Original Message-----
From: Calvin Gaisford [mailto:calvin@calderasystems.com]
Sent: Wednesday, March 08, 2000 4:30 PM
To: general@xml.apache.org
Subject: Re: Xerces 1.0.2 DOMParser, whitespace and #text elements


I have seen the same thing but I just found this in the docs:

This method is used to report all the whitespace characters,
which are determined to be 'ignorable'. This distinction
between characters is only made, if validation is enabled.

If I understand that correctly, ignoring whitespace only
works if you have validation turned on. Right?



"HERRICK, CHUCK (SBCSI)" wrote:

> Xerces-J 1.0.2 on NT in VisualAge for Java
>
> If you send setIncludeIgnorableWhitespace(false) to
> DOMParser, and then parse XML that has basically
> ELEMENT_NODEs, and a bit of whitespace between
> the start tags and end tags of the ELEMENT_NODEs,
> you get #text nodes that contain the white space
> (new line, tab, spaces, etc).
>
> What's up with that?
>
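
To see the #text nodes described in the quoted message, here is a small sketch that counts the whitespace-only text children of the root element. It is an illustration only: it uses the standard JAXP DocumentBuilder rather than the Xerces 1.0.2 DOMParser, with no DTD and no validation. The class and method names are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of Xerces or of the original thread.
public class WhitespaceTextNodes {

    // Returns how many direct children of the root element are
    // text nodes made up only of whitespace (spaces, tabs, newlines).
    static int whitespaceOnlyTextChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList kids = doc.getDocumentElement().getChildNodes();
            int count = 0;
            for (int i = 0; i < kids.getLength(); i++) {
                Node n = kids.item(i);
                if (n.getNodeType() == Node.TEXT_NODE
                        && n.getNodeValue().trim().isEmpty()) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // No DTD here, so the parser cannot know which whitespace is ignorable.
        String xml = "<a>\n\t<b/>\n  <c/>\n</a>";
        System.out.println(whitespaceOnlyTextChildren(xml)
            + " whitespace-only #text node(s) under the root");
    }
}
```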
