xalan-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (XALANJ-2493) BasisLibrary.nodeList2Iterator broken
Date Thu, 14 Aug 2014 11:34:12 GMT

    [ https://issues.apache.org/jira/browse/XALANJ-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096876#comment-14096876
] 

Uwe Schindler commented on XALANJ-2493:
---------------------------------------

Hi,
after upgrading to XALAN 2.7.2, I got very strange errors in one of my XSL transformations,
but just for some documents. It took a while to find the problem:
I have an extension function that returns a NodeList with some elements and text nodes (some
HTML). Unfortunately this NodeList contained an *empty* text node as very last element. This
is perfectly fine for DOM, because there is nothing that forbids empty text nodes. This empty
node caused the following error when invoking <xsl:copy-of select=java:Class:myExtensionFunction()"/>:

{{InternalRuntimeError: Expected element missing at 6}}

(6 is the last node in the Node-Set, the empty text node)

I fixed the issue by calling normalize() before returning the NodeList as quick hack. Later
I did a better fix to not create the empty nodes at all.

In XALAN 2.7.1 this worked without any problem.

Should I open a new issue for that?

> BasisLibrary.nodeList2Iterator broken
> -------------------------------------
>
>                 Key: XALANJ-2493
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2493
>             Project: XalanJ2
>          Issue Type: Bug
>      Security Level: No security risk; visible to anyone(Ordinary problems in Xalan projects.
 Anybody can view the issue.) 
>          Components: XSLTC
>    Affects Versions: 2.7.1
>            Reporter: Martin von Gagern
>             Fix For: The Latest Development Code, 2.7.2
>
>         Attachments: XALANJ-2493_1.patch, XALANJ-2493_2.patch, XALANJ_2493_Test1.java
>
>
> The current implementation of nodeList2Iterator is broken, because it can not deal with
attribute nodes. It relies on copyNodes which in turn tries to add attribute nodes as children
of some top level node. Attributes don't live on the children axis, though, so this is against
DOM and causes a DOM exception in the Xerces DOM implementation and probably most other implementations.
The resulting HIERARCHY_REQUEST_ERR was noted e.g. in XALANJ-2424.
> Furthermore, the implementation is inefficient, because it manually copies each and every
node from the source document to a new DTM to some new DTM. The time overhead for the copying
as well as the memory overhead for the additional DOM can be avoided in cases where the nodes
come from some input document, as opposed to a document completely loaded within some extension
function.
> A comment in the related XALANJ-2425 suggests returning DTMIterator from extension functions
and avoiding the re-import for those. I don't like this idea because it exposes a lot of Xalans
internals to extension functions, and because the returned node list might be newly created,
while at least some of thenodes might still be from the same document. So instead of special
cases for the list type, I implemented special cases for every node of the list. If it is
a proxy node of the same (Multi)DOM, we simply use its handle.
> If not, we add it to some w3c DOM and turn that into a DTM, pretty much like the current
implementation does. However, I dropped copyNodes in favor of Document.importNode, to avoid
code duplication and rely on supposedly more heavily tested code. I also added another level
of elements, so that there is one such dummy node for every item of the source list, with
always a single child or element. A few assertions help ensure this single child policy. This
is especially important in the new implementation, because otherwise it would become difficult
to get the proxied nodes and the newly DTMified nodes into correct order.
> Right now, the import of DOM nodes is only implemented for those nodes I expect to turn
up in the DTM in pretty much the same form as they do in the w3c DOM. For all other nodes,
an internal error is thrown. This especially concerns document fragment nodes. At least in
w3c DOM, a document fragment can never be a child, so if DTM behaves the same, we would need
to import document fragments seperately, or expand them to the list of their children instead.
I'm not sure what correct behaviour would be here, so I'd rather throw an exception than implement
wrong behaviour.
> I also noticed that org.apache.xml.dtm.ref.DTMManagerDefault.getDTMHandleFromNode would
in theory provide an implementation for turning w3c nodes into DTM handles - exactly what
we need here. That method seems to start importing from the topmost ancestor of a node, giving
as much context as possible, in contrast to both current and my suggested XSLTC approach,
which destroys ancestor references. That method also seems to employ caches in order to avoid
importing a document repeatedly. Sadly, actually using that method throws a ClassCastException
as it expects a DTM generate from a DOM source to be a DOM2DTM, which SAXImpl is not. A comment
inside that method also indicates that future implementations might drop auto-importing and
instead leave it to the caller to import a DOM if it hasn't been imported before.
> I left my own attempt at an nodeList2Iterator implementation using getDTMHandleFromNode
in place, renamed to nodeList2IteratorUsingHandleFromNode and made private. So it's there,
it even gets compiled, but it won't get used. If that method gets fixed in XSLTCDTMManager
or its ancestor, then this method might be used instead, giving a much simpler and cleaner
implementation. If some of my code would be useful for such an implementation as well, like
the check for "is same DOM", feel free to copy or move my code to those classes as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xalan.apache.org
For additional commands, e-mail: dev-help@xalan.apache.org


Mime
View raw message