axis-java-user mailing list archives

From Dennis Sosnoski
Subject Re: Why Pull-Parser faster ?
Date Thu, 20 Feb 2003 20:53:39 GMT
Bill de hÓra wrote:

> Dennis Sosnoski wrote:
>> AFAIK XPP has always reported whitespace properly. Glue's Electric 
>> XML is the only parser I know of that discards whitespace between 
>> elements (though this may have become an option rather than hardwired 
>> behavior in recent versions).
> Fair enough if a feature is provided that allows ignorable ws to be 
> just that. But that doesn't answer my question. 

Electric XML did not distinguish between ignorable whitespace (as 
defined in the XML specification, which requires validation) and 
ordinary whitespace separating elements - it just discarded it all. As I 
remember, the only whitespace it reported was inside elements with only 
character data content.

>> XPP3 is the current version of the XPP parser. XPP3 implements the 
>> XMLPull interface ( and is compliant with the 
>> XML specification except with respect to DTDs and related issues 
>> (general entities, etc.). These generally aren't used for 
>> data-oriented XML, and I think they're actually forbidden by SOAP.
> XPP has no business calling itself an XML anything if it doesn't square 
> with XML 1.0. If it wants to be a SOAP processor, it should be renamed. 

Your argument is perhaps better directed at SOAP itself. Many XML 
advocates feel that SOAP is not XML and should stop representing itself 
as such. To quote from the 1.1 spec: "A SOAP message MUST NOT contain a 
Document Type Declaration.  A SOAP message MUST NOT contain Processing 
Instructions." This means that SOAP uses a subset of XML, and the XML 
specification does not recognize subsets.
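
To make the restriction concrete, here is a rough sketch of how those two 
rules could be checked with the SAX classes that ship with the JDK. The 
class and method names (SoapSubsetCheck, Rejector, isInSoapSubset) are my 
own inventions for illustration, not part of any SOAP toolkit, and the 
check covers only the two rules quoted above.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: checks only the two SOAP 1.1 rules quoted above.
class SoapSubsetCheck {

    // Aborts the parse as soon as a DOCTYPE or a processing instruction is seen.
    static class Rejector extends DefaultHandler2 {
        @Override
        public void startDTD(String name, String publicId, String systemId)
                throws SAXException {
            throw new SAXException("Document Type Declaration not allowed");
        }

        @Override
        public void processingInstruction(String target, String data)
                throws SAXException {
            // The XML declaration is not reported here, so it is not rejected.
            throw new SAXException("Processing instruction not allowed: " + target);
        }
    }

    // Returns false for a DTD, a PI, or any other SAX parse failure
    // (so malformed XML is also rejected).
    static boolean isInSoapSubset(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        Rejector handler = new Rejector();
        reader.setContentHandler(handler);
        // DOCTYPE events are only delivered through the lexical-handler property.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

Note that the underlying parser still has to understand DTD syntax in 
order to report the DOCTYPE at all; a parser that skips DTD support 
entirely would simply fail on such input instead.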

If you want full XML support with an XMLPull parser you've got the 
option of using the implementation based on Xerces XNI (which I think 
includes full validation support), or could alternatively write a 
wrapper for the basic XMLPull interface that handles DTD processing. 
I've thought about doing that in the past, but it hasn't been high on my 
priority list.

IMHO one of the mistakes of the SAX interface design was to merge DTD 
handling and validation into the parser core. These types of functions 
can more cleanly be handled as layers over a core parser API. If this 
had been done with SAX we would not have the problem of some parsers 
supporting validation and others not - the validation would be a SAX 
filter that could be used with any parser.
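
As a minimal sketch of what such layering could look like, the following 
uses the standard XMLFilterImpl class from the JDK. The filter itself 
(AllowedElementsFilter) and its "allowed element names" rule are 
hypothetical stand-ins for real DTD validation, which would be far more 
involved; the point is only that the check sits on top of whatever 
XMLReader it is given.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example: a toy "validation" layer over any SAX parser.
class LayeredCheck {

    // Rejects any element whose qualified name is not in the allowed set,
    // and passes every other event through to the downstream handler.
    static class AllowedElementsFilter extends XMLFilterImpl {
        private final Set<String> allowed;

        AllowedElementsFilter(XMLReader parent, Set<String> allowed) {
            super(parent);
            this.allowed = allowed;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            if (!allowed.contains(qName)) {
                throw new SAXException("Unexpected element: " + qName);
            }
            super.startElement(uri, localName, qName, atts);
        }
    }

    // Parses xml through the filter; true if every element name is allowed.
    static boolean accepts(String xml, String... allowedNames) throws Exception {
        // Any XMLReader would do here; the filter does not care which parser it wraps.
        XMLReader core = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        XMLReader checked =
                new AllowedElementsFilter(core, new HashSet<>(Arrays.asList(allowedNames)));
        checked.setContentHandler(new DefaultHandler());
        try {
            checked.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

The application only ever talks to the filter, so swapping the core 
parser does not change whether the check is available.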

  - Dennis
