xerces-c-dev mailing list archives

From "Erik Wright (JIRA)" <xerces-c-...@xml.apache.org>
Subject [jira] Updated: (XERCESC-1828) LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
Date Wed, 13 Aug 2008 15:08:44 GMT

     [ https://issues.apache.org/jira/browse/XERCESC-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Wright updated XERCESC-1828:
---------------------------------

    Attachment: Test.java

Here is the Java file used to generate the Java comparison output.

> LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
> -----------------------------------------------------------------------------------
>
>                 Key: XERCESC-1828
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1828
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: SAX/SAX2
>    Affects Versions: 2.8.0
>         Environment: OS/X, Win32
>            Reporter: Erik Wright
>         Attachments: java.output, SAX2EventsSample.tgz, Test.java, test.output, test.xml
>
>
> It appears that the LexicalHandler events startEntity and endEntity are not
> sent correctly when parsing a document with a DTD that itself references
> external entities.
> (Note: I will attach sample XML, repro code, and the full output of the code.
> The following is a summary.)
> For example, I have been parsing a valid XHTML document. The strict XHTML DTD
> includes 4 other files with entity declarations. I see the following events
> on my LexicalHandler (ignoring elements, characters, whitespace, external
> entity declarations, and comments):
> startDocument
> ...
> startDTD: html, -//W3C//DTD XHTML 1.0 Strict//EN, http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
> ...
> startEntity: [dtd]
> ...
> startEntity: [dtd]
> ...
> startEntity: [dtd]
> ...
> startEntity: [dtd]
> ...
> endEntity: [dtd]
> ...
> endDTD
> ...
> endDocument
> I expected something more like the following (as generated by the standard
> SAX parser in Java 6):
> startDocument
> startDTD: 'html', '-//W3C//DTD XHTML 1.0 Strict//EN', 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
> startEntity: '[dtd]'
> startEntity: '%HTMLlat1'
> endEntity: '%HTMLlat1'
> startEntity: '%HTMLsymbol'
> endEntity: '%HTMLsymbol'
> startEntity: '%HTMLspecial'
> endEntity: '%HTMLspecial'
> startEntity: '%head.misc'
> endEntity: '%head.misc'
> startEntity: '%head.misc'
> endEntity: '%head.misc'
> startEntity: '%head.misc'
> endEntity: '%head.misc'
> startEntity: '%head.misc'
> endEntity: '%head.misc'
> startEntity: '%head.misc'
> endEntity: '%head.misc'
> startEntity: '%block'
> endEntity: '%block'
> startEntity: '%inline'
> endEntity: '%inline'
> startEntity: '%misc'
> endEntity: '%misc'
> startEntity: '%block'
> endEntity: '%block'
> startEntity: '%misc'
> endEntity: '%misc'
> startEntity: '%block'
> endEntity: '%block'
> startEntity: '%inline'
> endEntity: '%inline'
> startEntity: '%misc'
> endEntity: '%misc'
> endEntity: '[dtd]'
> endDTD
> startPrefixMapping: '', 'http://www.w3.org/1999/xhtml'
> endPrefixMapping: ''
> endDocument
> At a minimum, the mismatch of startEntity/endEntity events appears to be
> caused by the following code from DTDScanner::scanExtSubsetDecl (notice that
> the conditions are not the same):
>    if (fDocTypeHandler && !inIncludeSect)
>        fDocTypeHandler->startExtSubset();
>    ...
>    ...
>    ...
>    if (fDocTypeHandler && isDTD)
>        fDocTypeHandler->endExtSubset();
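
To illustrate why differing guards can produce unpaired events, here is a minimal, self-contained sketch. It is not Xerces-C++ code: EntityEvent, scanWithMismatchedGuards and entitiesArePaired are hypothetical names invented for this example, and the flag values passed in are assumptions, not the values the real scanner uses. It only models the shape of the excerpt above (start guarded by one flag, end guarded by another) and checks the resulting trace for proper nesting.

```cpp
#include <string>
#include <vector>

// One recorded entity event (hypothetical trace record, not a Xerces type).
struct EntityEvent {
    bool isStart;      // true for startEntity, false for endEntity
    std::string name;  // entity name as reported, e.g. "[dtd]"
};

// Hypothetical stand-in for one scanner call that guards the start and end
// callbacks with two different flags, mirroring the shape of the excerpt.
void scanWithMismatchedGuards(bool inIncludeSect, bool isDTD,
                              std::vector<EntityEvent>& trace) {
    if (!inIncludeSect)
        trace.push_back({true, "[dtd]"});   // models startExtSubset()
    // ... declarations would be scanned here ...
    if (isDTD)
        trace.push_back({false, "[dtd]"});  // models endExtSubset()
}

// Returns true when every start event is closed by an end event with the
// same name, in properly nested order, and nothing is left open at the end.
bool entitiesArePaired(const std::vector<EntityEvent>& events) {
    std::vector<std::string> open;  // stack of currently open entity names
    for (const EntityEvent& ev : events) {
        if (ev.isStart) {
            open.push_back(ev.name);
        } else {
            if (open.empty() || open.back() != ev.name)
                return false;       // end without a matching start
            open.pop_back();
        }
    }
    return open.empty();            // leftover starts mean unpaired events
}
```

In this model, a call where the two flags agree (for example inIncludeSect == false and isDTD == true) leaves the trace balanced, while a call with inIncludeSect == false and isDTD == false records a start with no matching end. The same entitiesArePaired check could be applied to a trace recorded from a real LexicalHandler to confirm whether a fix restores pairing.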

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org

