axis-java-dev mailing list archives

From jayachandra <>
Subject Re: [Axis2] [Update] XMLConformace Testing Report.
Date Tue, 26 Apr 2005 06:21:44 GMT

Sanjiva, we do have issues with OM as well.
-->To start with, OM lacks PI, comment and DTD support. On my end, I
added implementations of these to the OM code base and then ran the tests.
-->The namespace binding for the 'xml' prefix is supposed to be in scope
for every XML element. As a workaround on my machine, I declared this
namespace inside the OMElementImpl constructors before running the
tests.
-->OM does not support the 'baseURI' property on OMElement. If we keep
track of this one thing in OM, it can help us reduce the number of
parsed tests that fail at the comparison phase by a good number (a few
fifties).
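
To illustrate the second and third points, here is a minimal sketch using only the JDK (its bundled StAX parser, not MXParser or OM). It checks which namespace a parser reports for an xml:lang attribute on an element that never declares the 'xml' prefix, and it shows how a relative SYSTEM identifier could be resolved once a base URI is known. The class name, method names and the example.org URLs are made up for illustration.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; not part of OM or Axis2.
final class XmlPrefixAndBaseUriSketch {

    // Returns the namespace URI reported for xml:lang on an element
    // that does not declare the "xml" prefix itself.
    static String namespaceOfXmlLang() {
        String doc = "<a xml:lang=\"en\"/>";
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(doc));
            while (reader.next() != XMLStreamConstants.START_ELEMENT) {
                // skip ahead to the root element
            }
            return reader.getAttributeNamespace(0);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Resolves a relative system identifier against the base URI of the
    // document (or entity) in which it appeared.
    static String resolveSystemId(String baseUri, String systemId) {
        return URI.create(baseUri).resolve(systemId).toString();
    }

    public static void main(String[] args) {
        System.out.println(namespaceOfXmlLang());
        // Both URLs below are placeholders, not real test-suite locations.
        System.out.println(resolveSystemId(
                "http://example.org/xmlconf/valid/sa/097.xml", "097.ent"));
    }
}
```

If OM recorded the document's base URI when building the tree, a comparison tool could apply the same resolution step to both the input and the serialized output.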

However, getting 100% success is unlikely without a *full* DTD
implementation built into OM. Alek was saying that DTD support is not
that well implemented in StAX, it seems, and he suggested using
Woodstox if we need it.
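
For reference, a StAX reader surfaces comments, PIs and the DOCTYPE as separate event types, which is what an object model has to handle to round-trip them. The sketch below lists the event types the JDK's bundled StAX parser reports for a small document. It is only an illustration under that assumption (other implementations such as MXParser or Woodstox may differ in detail), and the class name and sample document are made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; not part of OM or Axis2.
final class StaxEventKindsSketch {

    // Collects the numeric event types (see XMLStreamConstants) in the
    // order the reader reports them, starting with the initial state.
    static List<Integer> eventTypes(String xml) {
        List<Integer> types = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            types.add(reader.getEventType());
            while (reader.hasNext()) {
                types.add(reader.next());
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return types;
    }

    public static void main(String[] args) {
        String doc = "<!DOCTYPE a [<!ELEMENT a EMPTY>]>"
                + "<!-- a comment --><?target data?><a/>";
        // Compare the printed numbers with XMLStreamConstants.DTD,
        // COMMENT and PROCESSING_INSTRUCTION.
        System.out.println(eventTypes(doc));
    }
}
```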

And Sanjiva, just to be extra cautious that I don't give out the wrong
signals :-)... so far I have tested OM *only* against valid XML 1.0
documents, which any infoset implementation should be able to parse and
serialize. We haven't tested how well OM can _reject_ invalid and
ill-formed XMLs. Those actually form the larger fraction of the XML
suite, about 1800 :-(

Thanks for all your support

On 4/25/05, Sanjiva Weerawarana <> wrote:
> Hi Jaya,
> Wow, thanks for all the hard work on this!
> Do I read your report correctly as this test didn't find any bugs
> in the OM level but rather encountered difficulties in the parser
> level?? If so I'm very happy :-).
> Of the passing ones, what made 735-567 documents not compare
> successfully? Can we fix that?
> Thanks,
> Sanjiva.
> ----- Original Message -----
> From: "jayachandra" <>
> To: <>
> Sent: Monday, April 25, 2005 7:26 PM
> Subject: [Axis2] [Update] XMLConformace Testing Report.
> > Hi all,
> > Total file count in W3C XMLSuite: 2634 (this includes valid, invalid
> > and ill-formed XMLs too)
> > Of them, valid ones: 960 (i.e. excluding invalid and ill-formed XMLs.
> > However, this includes XMLs of both versions 1.0 and 1.1)
> > Of them, valid XML 1.0 ones: 832 (i.e. excluding XMLs from the 1.1
> > version folders, since the MXParser we have beneath is only 1.0
> > compliant)
> > On this final set, when OM is tested as is, 335 files got parsed
> > properly, and 309 files had the serialized XML matching the input file
> > (comparison test). I've implemented OMComment and OMPI and did
> > minimalistic OMDTD (without validation etc.) support. With those
> > changes the parsing rate increased to 735 and comparison success
> > reached 567.
> > The parsing failures found can be attributed to one or more of the
> > following observations I could make. This is not an exhaustive list,
> > though.
> > 1. For files where the XML declaration line mentions the 'standalone'
> > attribute prior to the 'encoding' attribute, the underlying MXParser
> > threw an exception with a message reading something like "Expected 'e'
> > in encoding and not 's'". Alek! Is this a known issue with StAX? What
> > do you think?
> > 2. For files in which the DTD declaration has a right square bracket
> > (']') as a literal value of some entity, MXParser is treating it as
> > the end of the DTD declaration.
> > 3. Some XMLs having multi-byte characters (UK currency pound sign
> > amongst others) are failing to get parsed, with typical exception
> > messages like "only whitespace content allowed before start tag and
> > not \ufffd". I have passed a "UTF-8" aware reader to the builder; do I
> > need to use something else here?
> > 4. Apart from these, because I couldn't implement the complete DTD
> > infoset implementation, some more files are failing to get parsed.
> > Regarding the comparison, some of the observed reasons for failures
> > are:
> > 1. Many SYSTEM identifiers in DTD declarations used a relative
> > reference, and so far we haven't considered the 'baseURI' property
> > (does the StAX parser provide one?) for any of the elements. Hence the
> > XML comparator (XMLUnit) couldn't resolve the system identifiers,
> > thereby leading to a mismatch between the serialized XML and the
> > original input form.
> > 2. Also, since the DTD support is naïve, the presentation of data is
> > completely ignored, thereby leading to scenarios like serializing as
> > #PCDATA when the DTD says CDATA. This also led to significant
> > comparison failures.
> > Thanks,
> > Jaya
> > ---- Jaya
> >
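
Two of the parsing failures in the quoted report (items 1 and 3) can be probed in isolation. The sketch below rests on guesses, since the report does not show the failing files. For item 1 it assumes the declarations carry 'standalone' with no 'encoding' at all, which XML 1.0 allows (when both are present, 'encoding' has to come first). For item 3 it assumes the files are not actually UTF-8, so that forcing a UTF-8 reader turns their bytes into U+FFFD. It uses the JDK's bundled StAX parser rather than MXParser, and the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; not part of OM or Axis2.
final class XmlDeclAndEncodingSketch {

    // A Latin-1 encoded document containing a pound sign (byte 0xA3).
    static byte[] latin1Document() {
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<p>\u00A3</p>";
        return doc.getBytes(StandardCharsets.ISO_8859_1);
    }

    // True if decoding the bytes as UTF-8 produces the replacement
    // character U+FFFD, i.e. the bytes are not valid UTF-8.
    static boolean misdecodedAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8).indexOf('\uFFFD') >= 0;
    }

    // Hands the raw bytes to the parser so that it can pick the encoding
    // from the XML declaration, then returns the root element's text.
    static String textWhenParserPicksEncoding(byte[] bytes) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(bytes));
            while (reader.next() != XMLStreamConstants.START_ELEMENT) {
                // skip ahead to the root element
            }
            return reader.getElementText();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses a declaration that has 'standalone' but no 'encoding' and
    // reports whether the reader saw standalone="yes".
    static boolean standaloneWithoutEncodingParses() {
        String doc = "<?xml version=\"1.0\" standalone=\"yes\"?><a/>";
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(doc));
            while (reader.next() != XMLStreamConstants.START_ELEMENT) {
                // skip ahead to the root element
            }
            return reader.isStandalone();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = latin1Document();
        System.out.println(misdecodedAsUtf8(bytes));
        System.out.println(textWhenParserPicksEncoding(bytes));
        System.out.println(standaloneWithoutEncodingParses());
    }
}
```

If the second guess holds, passing the builder an InputStream (or a reader created for the file's declared encoding) instead of a fixed UTF-8 reader may be worth trying.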

-- Jaya