xerces-j-dev mailing list archives

From Rajiv Shivane <mur...@yahoo.com>
Subject Re: Error recovery while parsing XMLs ..
Date Tue, 08 Apr 2003 11:30:55 GMT
Hi Sandy,

I completely agree that during error recovery there is
typically more than one "correction" that can be
applied to the stream to make it parseable. Any parser
can at most take a best guess at which correction to
apply to recover from the error.

To make the best guess, is it possible to make the
parser look at the DTD and eliminate some of the
corrections? In the example I gave:

<icon >

There is more than one correction with which the
parser can recover. But the element declarations are:

<!ELEMENT icon (small-icon?, large-icon?)>
<!ELEMENT small-icon (#PCDATA)>
<!ELEMENT large-icon (#PCDATA)>

So the best guess in this case should have been to add
</small-icon> before <large-icon>.

Do you think it is possible to improve the recovery by
making the parser consider the content model during
the error recovery phase? Could you give me some hints
as to how I can go about doing this?
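
A rough sketch of the kind of check this would involve, in plain
Java and not using any Xerces API. The class name, the method
endTagsToInsert and the ALLOWED_CHILDREN table are all hypothetical;
the table is a hand-written simplification of the three element
declarations above that ignores order and cardinality.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ContentModelRecovery {
    // Hypothetical, simplified view of the DTD: element name -> child
    // elements it may contain. An empty set stands for (#PCDATA).
    static final Map<String, Set<String>> ALLOWED_CHILDREN = Map.of(
        "icon", Set.of("small-icon", "large-icon"),
        "small-icon", Set.of(),
        "large-icon", Set.of());

    /**
     * Given the open elements (innermost first) and the start tag just
     * seen, returns the end tags that would have to be inserted so that
     * the new element lands in a parent whose content model allows it.
     * Returns an empty list if no correction is needed or none is found.
     */
    static List<String> endTagsToInsert(Deque<String> openElements, String next) {
        List<String> toClose = new ArrayList<>();
        for (String open : openElements) {
            Set<String> allowed = ALLOWED_CHILDREN.getOrDefault(open, Set.of());
            if (allowed.contains(next)) {
                return toClose; // this ancestor accepts the new element
            }
            toClose.add(open);  // would need to be closed first
        }
        return List.of();       // no ancestor accepts it: no suggestion
    }

    public static void main(String[] args) {
        Deque<String> open = new ArrayDeque<>();
        open.push("icon");
        open.push("small-icon"); // innermost open element
        System.out.println(endTagsToInsert(open, "large-icon")); // prints [small-icon]
    }
}
```

This only narrows the candidates. It is still a guess about what the
author of the document intended.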


--- Sandy Gao <sandygao@ca.ibm.com> wrote:
> But how would the parser know the input is not
>   <icon >
>    <small-icon>stillTypingMygif
>    <large-icon>/tmp/large.gif</large-icon>
>    </small-icon>
>   </icon>
> in which case adding </small-icon> before
> <large-icon> is worse.
> In dealing with errors (especially well-formedness
> ones), the best the
> parser can do is to make a guess, which can't be
> guaranteed to be the best
> in all cases.
> Thanks,
> Sandy Gao
> Software Developer, IBM Canada
> (1-905) 413-3255
> sandygao@ca.ibm.com
