cocoon-dev mailing list archives

From Simone Tripodi <simone.trip...@gmail.com>
Subject Re: XInclude optimization
Date Mon, 23 Nov 2009 19:11:29 GMT
Hi Sylvain and Simone,
thank you very much, the suggestions you provided are all very
interesting. I now wonder whether it is possible to build a processor
that uses the Tika way when it recognizes certain kinds of paths, and
"XSL-on-the-fly" for more complex cases. What do you think?

Sylvain, I still haven't read the Tika documentation; could you point
me to the docs related to this topic?

Simo, have you already tried generating the XSLT on the fly?
The most basic approach I can think of is generating the XSL string
from a template and then passing it to the XSL parser, but I'm sure it
could be implemented in a better way :P
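
To make the idea concrete, here is a minimal sketch of what I mean by
"template, then parser". It uses only the standard javax.xml.transform
API; the class name, the template string and the helper methods are
hypothetical names made up for illustration, nothing that exists in
Cocoon, and it assumes the XPointer has already been reduced to a plain
XPath expression.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, for illustration only.
class XsltOnTheFly {

    // Stylesheet template; %s is replaced by the XPath selecting the fragment.
    private static final String TEMPLATE =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:copy-of select=\"%s\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Builds the stylesheet text, escaping the XPath so it is safe
    // inside an XML attribute value.
    static String buildStylesheet(String xpath) {
        String escaped = xpath.replace("&", "&amp;")
                              .replace("<", "&lt;")
                              .replace("\"", "&quot;");
        return String.format(TEMPLATE, escaped);
    }

    // Compiles the generated stylesheet and applies it to the input document.
    static String extract(String xml, String xpath) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(buildStylesheet(xpath))));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><a>skip</a><b id=\"x\">keep</b></doc>";
        // Copies only the elements matched by the XPath to the output.
        System.out.println(extract(xml, "/doc/b"));
    }
}
```

In a real pipeline the result would probably go to a SAXResult instead
of a string, and the compiled stylesheet could be cached per expression
as a Templates object, but whether that is enough to keep buffering low
depends on the XSLT engine.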

Any suggestion will be much appreciated, thanks in advance.

Best regards, have a nice evening!!!
Simone

On Mon, Nov 23, 2009 at 7:16 PM, Sylvain Wallez <sylvain@apache.org> wrote:
> Simone Gianni wrote:
>>
>> Hi Simone and Sylvain,
>> aren't XSLT transformers already SAX/XPath optimized? I mean, isn't an XSLT
>> containing an XPath expression and used in a SAX context already able
>> to resolve the XPath while keeping buffering to the minimum possible?
>>
>> I can clearly remember that there has been a lot of work on this in
>> Xalan and other XSLT engines, and also how a complex XPath expression could
>> change the performance of a transformation because of increased buffering.
>
> Xalan has an optimized implementation of the document tree [1], more
> efficient than the standard DOM for read-only and selection operations.
> Xalan has an incremental processing mode, but IIRC it's more about being
> able to produce some output before the whole document has been read than
> about avoiding building parts of the document tree. So it will allow for
> faster processing, but won't change memory consumption.
>
>> In that case, maybe, instead of reinventing it, it should be possible to
>> delegate the "transformation" (extraction of a fragment from the entire XML
>> stream) to an XSLT processor. The simplest way could be to generate an XSLT
>> on the fly :) .. the correct way would be to use the [Xalan|Saxon|any other]
>> internal APIs to perform the XPath resolution. In both cases, it will be
>> faster than transforming to DOM.
>
> Agree. It may be easier to produce a small XSL transformation from the
> XPointer expression than using Axiom. But still, for simple expressions, the
> pure streaming approach used by Tika would be way more efficient.
>
> Sylvain
>
> [1] http://xml.apache.org/xalan-j/dtm.html
>
> --
> Sylvain Wallez - http://bluxte.net
>
>



-- 
http://www.google.com/profiles/simone.tripodi
