camel-issues mailing list archives

From "Stephan Siano (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CAMEL-8273) More flexible selection of default documentType in XPath expressions
Date Mon, 02 Feb 2015 06:39:34 GMT

    [ https://issues.apache.org/jira/browse/CAMEL-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300929#comment-14300929
] 

Stephan Siano commented on CAMEL-8273:
--------------------------------------

Saxon cannot do XPath in streaming mode (I actually don't think it is even possible
to have a full XPath implementation with streaming), but it supports XPath with TinyTree (which
is much smaller than the Xerces DOM). If the XML parsing is done during the XPath evaluation
(the document is provided not as a DOM tree but as something else, like an InputSource), Saxon will
parse it into that TinyTree, which was actually the purpose of my patch. Unfortunately I overlooked
the XXE issue.

I think I will check two things now:
1. whether Saxon will also allow XXE attacks if some non-parsed type (like InputSource) is
used for the conversion
2. if that is the case, convert to NodeInfo (the Saxon interface for DOM-like nodes;
the TinyTree is an implementation of it) and do the XPath evaluation with that.

Both approaches require setting the documentType parameter to something other than Document. Unfortunately
I don't see a way to do that automatically when Saxon is used...
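
To illustrate the two input paths discussed above, here is a minimal sketch that uses only the JDK XPath API instead of Saxon (an assumption made so the example has no external dependencies). The class and method names are hypothetical. The first method parses into a DOM Document with DOCTYPE declarations disabled, which is one common way to block XXE. The second method hands an unparsed InputSource to the XPath engine, so the caller no longer controls how the parser is configured. Whether Saxon behaves the same way for unparsed input is exactly the open question in point 1 and is not answered by this sketch.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of Camel or of the attached patch.
class XPathInputSketch {

    // Path A: parse into a DOM Document first. DOCTYPE declarations are
    // rejected, so external entities cannot be declared at all.
    static String viaHardenedDom(String xml, String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    // Path B: pass the unparsed InputSource to the XPath engine, which then
    // does the parsing itself with its own parser configuration.
    static String viaInputSource(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }
}
```

For a well-formed document without a DOCTYPE both methods return the same string value; the difference only shows up in who owns the parser settings.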

> More flexible selection of default documentType in XPath expressions
> --------------------------------------------------------------------
>
>                 Key: CAMEL-8273
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8273
>             Project: Camel
>          Issue Type: Improvement
>          Components: camel-core
>            Reporter: Stephan Siano
>            Assignee: Claus Ibsen
>             Fix For: Future
>
>         Attachments: 0001-CAMEL-8273-More-flexible-selection-of-default-docume.patch
>
>
> In the current implementation of XPath, if no documentType is defined (likely in most
> cases), the document used for XPath evaluation is parsed into a (DOM) Document using the JDK
> XML parser before the XPath expression is applied to it.
> For large documents this might be resource intensive, especially if the XPath is evaluated
> using a more efficient parser like Saxon.
> With the current implementation it is possible to work around this by setting a documentType
> attribute on the XPath expression, but doing this efficiently requires some internal knowledge
> about the previous component in the Camel route (which type it creates) and the qualities
> of the XML parser in use (e.g. the JDK parser accepts only InputSource and Node as input types
> for XPath evaluation, whereas Saxon also supports other types like SAXSource).
> The attached patch will make the data type used by default for XPath evaluation more
> flexible (depending on the type of the input).
> There are two cases to differentiate:
>
> documentType is set on the XPath expression:
> current implementation:
> 1. try to convert to the documentType
> 2. if that fails, do some extra conversions for some additional data types (WrappedFile,
>    BeanInvocation, String)
> 3. if that fails, throw an exception
> new implementation:
> 1. try to convert to the documentType
> 2. if that fails, use the message if it is of type Node, InputSource or DOMSource, or
>    do some type conversions for specific data types (WrappedFile, BeanInvocation, String,
>    InputStream, Reader, byte[]...)
> 3. if that fails, throw an exception
>
> documentType is not set on the XPath expression:
> old implementation:
> this is actually the same as if documentType was set to Document
> new implementation:
> 1. use the message if it is of type Node, InputSource or DOMSource, or do some type conversions
>    for specific data types (WrappedFile, BeanInvocation, String, InputStream, Reader, byte[]...)
>    (to InputSource)
> 2. if the old message is not of one of the types above, convert to DOM Document
> 3. if this fails, throw an exception
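
The selection order described in the issue for the case where no documentType is set can be sketched roughly as follows. This is not the code from the attached patch. The class and method names are hypothetical, only JDK types are used, and the Camel-specific types (WrappedFile, BeanInvocation) are left out on purpose. A null return stands for "fall back to converting the body to a DOM Document", which is step 2 of the new implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of Camel or of the attached patch.
class DefaultDocumentTypeSketch {

    // Returns the object to hand to the XPath engine, or null when the caller
    // should fall back to a DOM Document conversion.
    static Object chooseXPathInput(Object body) {
        // Step 1a: types the issue lists as usable without conversion.
        if (body instanceof Node || body instanceof InputSource || body instanceof DOMSource) {
            return body;
        }
        // Step 1b: cheap conversions to InputSource, so that no DOM tree is
        // built before the XPath engine sees the document.
        if (body instanceof String) {
            return new InputSource(new StringReader((String) body));
        }
        if (body instanceof byte[]) {
            return new InputSource(new ByteArrayInputStream((byte[]) body));
        }
        if (body instanceof InputStream) {
            return new InputSource((InputStream) body);
        }
        if (body instanceof Reader) {
            return new InputSource((Reader) body);
        }
        // Step 2 is left to the caller: convert to a DOM Document.
        return null;
    }
}
```

The point of ordering the checks this way is that a body that is already parsed is never parsed again, and an unparsed body stays unparsed until the XPath engine (for example Saxon) decides which tree model to build.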



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
