xerces-j-users mailing list archives

From Klaus Malorny <Klaus.Malo...@knipp.de>
Subject Re: "normalized-value" feature of Xerces
Date Thu, 14 Jul 2005 08:48:37 GMT
Klaus Malorny wrote:
> Hi,
> If this is a FAQ, please excuse and point me to the right location. I 
> did my homework, but did not find any suitable information.
> I have the following problem: I would like to parse and validate an XML 
> document and access this XML document via DOM afterwards. As the related 
> schema makes extensive use of the "normalizedString" and "token" 
> datatypes (i.e. those with the whiteSpace facet with "replace" and 
> "collapse" values), I would like to access the whitespace normalized 
> values rather than the actual values contained in the original XML 
> document to avoid a manual normalization at every location in my code.
> I saw that Xerces (I tried the latest version 2.7.0) supports a feature 
> called "http://.../normalized-value". However, I see no difference when 
> I set this to "true". The DOM nodes still contain the unnormalized 
> values. I also set the other required features as documented. I verified 
> that validation is actually performed, i.e. parsing an invalid document 
> does result in an exception.
> I use the following way to create the parser (javax.xml.* classes under 
> JDK 1.5):
> [...]
> Any ideas, comments on what I am doing wrong? Or do I misunderstand this 
> feature?
> Thanks in advance for any feedback.
> regards,
> Klaus


By building a debug version of Xerces and stepping through my code together
with Xerces, I accidentally discovered the source of my problem. To create a
"Schema" object, I used the following code (from the javax.xml.validation
package):

     SchemaFactory factory =
       SchemaFactory.newInstance (XMLConstants.W3C_XML_SCHEMA_NS_URI);

     Source[] sources = ...

     Schema schema = factory.newSchema (sources);

Unfortunately, this does not create a Schema instance that uses Xerces 2.7.0
code. Instead, it creates a Schema instance from the Xerces that comes with
J2SE 5, which apparently does not support the desired normalization feature.
Xerces 2.7.0 seems to detect that this class is not one of its own and inserts
the J2SE validator into its pipeline (with XNI <-> SAX adapters).
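
A quick way to see which implementation the JAXP lookup actually hands back is
to print the class name of the returned factory. The following is only a
minimal sketch; the class name SchemaFactoryCheck and the method
implementationName are made up for illustration. The copy bundled with the JDK
lives under the com.sun.org.apache.xerces.internal package prefix, while the
Apache distribution uses org.apache.xerces.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

// Hypothetical helper, not part of Xerces or JAXP.
public class SchemaFactoryCheck {

    /** Returns the class name of the SchemaFactory selected by the JAXP lookup. */
    public static String implementationName() {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.getClass().getName();
    }

    public static void main(String[] args) {
        // A name starting with "com.sun.org.apache.xerces.internal." indicates
        // the JDK-bundled copy; "org.apache.xerces." indicates Apache Xerces.
        System.out.println(implementationName());
    }
}
```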

If I create the factory directly, i.e.

     SchemaFactory factory =
       new org.apache.xerces.jaxp.validation.XMLSchemaFactory ();

everything works as expected. My big question now is: why do I not get a
suitable factory from Xerces 2.7.0, while the corresponding JAXP parser factory
actually does come from 2.7.0? Is this a bug? How do I manage to get the 2.7.0
implementation with SchemaFactory.newInstance ()?
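
For reference, the SchemaFactory.newInstance Javadoc describes a system
property of the form "javax.xml.validation.SchemaFactory:<schemaLanguage>"
that is consulted first during the lookup. The sketch below sets that property
to the Xerces factory class named above, but only when that class can be
loaded. It is an assumption, not something verified against Xerces 2.7.0, that
this resolves the situation described here; the class and method names
(SchemaFactorySelect, propertyKey, xercesSchemaFactoryIfAvailable) are
hypothetical.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

// Hypothetical helper, not part of Xerces or JAXP.
public class SchemaFactorySelect {

    // Factory class taken from the direct instantiation shown above.
    static final String XERCES_FACTORY =
        "org.apache.xerces.jaxp.validation.XMLSchemaFactory";

    /** Builds the lookup key documented for SchemaFactory.newInstance. */
    public static String propertyKey() {
        return SchemaFactory.class.getName() + ":"
            + XMLConstants.W3C_XML_SCHEMA_NS_URI;
    }

    /**
     * Points the JAXP lookup at the Xerces factory if that class is on the
     * classpath, then performs the normal lookup. Without Xerces the property
     * is left untouched and the platform default is returned.
     */
    public static SchemaFactory xercesSchemaFactoryIfAvailable() {
        try {
            Class.forName(XERCES_FACTORY);
            System.setProperty(propertyKey(), XERCES_FACTORY);
        } catch (ClassNotFoundException e) {
            // Xerces is not available; keep the default lookup behaviour.
        }
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    }

    public static void main(String[] args) {
        System.out.println(
            xercesSchemaFactoryIfAvailable().getClass().getName());
    }
}
```

The same property can also be passed on the command line with -D instead of
being set in code, which keeps the application source free of
implementation-specific class names.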

Thanks in advance for any hints.



