cocoon-dev mailing list archives

From Simone Tripodi <simone.trip...@gmail.com>
Subject Re: XInclude optimization
Date Sun, 22 Nov 2009 15:57:30 GMT
Hi Sylvain,
thanks for your kind reply! I suspected the XPath limitations you
explained so well, but deep in my heart I was hoping for a solution
I didn't know yet, which is why I asked :P :P

I'll take a look at both solutions, even if the first sounds to me
more compliant with the XPointer recommendation and at the same time
closer to what I already did - and to older Cocoon XInclude
implementations.

Thank you very much for your hints, much appreciated :)
A bientot!
Simone

P.S. Off-topic: maybe I'm wrong, but I'm sure we met once in Toulouse; I
was one of the Asemantics juniors involved in Joost :P

On Sun, Nov 22, 2009 at 3:27 PM, Sylvain Wallez <sylvain@apache.org> wrote:
> Simone Tripodi wrote:
>>
>> Hi all,
>> I'm very sorry I don't appear on the ML more often, but since April
>> I've been working very hard for a client in Paris, which leaves me
>> no spare time to dedicate to OS projects.
>>
>
> Don't be sorry. We all have our own jobs/interests/duties that have driven us
> away from Cocoon. Glad to see you back!
>
>> I'm writing because I'm sure the XInclude transformer I submitted some
>> time ago could be optimized, so I'd like to ask you for a little help :)
>>
>> The current state is that, when including an entire document, it is
>> processed efficiently through the SAX APIs; the problem comes when
>> processing a document referenced by xinclude+xpointer, which forces the
>> processor to extract a sub-document of the included one.
>>
>> To do this, I parse the included document into a DOM, use XPath to
>> extract the sub-document that has to be included, then walk the
>> selected nodes and convert them to SAX events. As you noticed, this
>> takes time, too much IMO, but I didn't find/don't know any better
>> solution :(
>> Since you have experience with StAX, maybe you can suggest a fast
>> way to evaluate XPath against a document and fire SAX events, so that
>> I can provide a much better - and, above all, faster - solution.
>>
>> Any hints? Every suggestion would be much appreciated.
>>
>
> The problem with XPath and XML streaming (be it SAX or StAX) is that XPath
> is a language that allows exploring the document tree in all directions and
> thus inherently expects having the whole document tree available, which is
> clearly not compatible with streaming.
>
> There are different approaches to solving this:
> - use a deferred loading DOM implementation, which buffers events only when
> it needs them to traverse the tree. Axiom [1] provides this IIRC, along with
> an XPath implementation.
> - restrain the XPointer expression to a subset of XPath that can easily be
> implemented on top of a stream. This means restricting selection only on the
> current element, its attributes and its ancestors. There's an implementation
> of this approach in Tika [2].
>
> The XInclude transformer can be smart enough to use the most efficient
> implementation for the given XPath expression, i.e. try to parse it with
> Tika's restricted subset, and fall back to something more costly, either
> Axiom or plain DOM.
>
> Sylvain
>
> [1] http://ws.apache.org/commons/axiom/
> [2]
> https://svn.apache.org/repos/asf/lucene/tika/trunk/tika-core/src/main/java/org/apache/tika/sax/xpath/
>
> --
> Sylvain Wallez - http://bluxte.net
>
>
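
For reference, the DOM-based approach described in the quoted message
(parse to DOM, select with XPath, replay the selection as SAX events) can
be sketched with plain JDK APIs roughly as follows. This is only an
illustration under assumptions, not the actual Cocoon transformer code:
the class DomXPathToSax and its methods are hypothetical names, and a real
transformer would also have to filter out the startDocument/endDocument
events that the identity transform fires for every selected node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: the whole document is loaded into memory first,
// which is exactly the cost discussed in this thread.
final class DomXPathToSax {

    // Parses xml into a DOM, selects nodes with xpath and replays each
    // selected subtree as SAX events on the target handler.
    static void include(String xml, String xpath, ContentHandler target) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList selected = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);

        // An identity transform from a DOMSource to a SAXResult turns a
        // DOM subtree into SAX events.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        for (int i = 0; i < selected.getLength(); i++) {
            identity.transform(new DOMSource(selected.item(i)), new SAXResult(target));
        }
    }

    // Demo only: records the qualified names of the elements that reach the
    // target handler, so the effect of the selection can be inspected.
    static List<String> startedElements(String xml, String xpath) {
        final List<String> names = new ArrayList<>();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            include(xml, xpath, recorder);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

Calling DomXPathToSax.startedElements(xml, "//a") on a small document
lists only the elements inside the selected subtrees. Any XPath expression
works here, at the price of building the full tree before the first SAX
event is fired.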

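The restricted-subset idea from the quoted message, together with the
"try the cheap path first, fall back otherwise" strategy, might look
roughly like the sketch below. It is an assumption-laden toy, not Tika's
implementation: StreamingPathFilter is a hypothetical class, and the
subset it accepts is limited to absolute child paths such as /doc/a, which
can be decided from the current element and its ancestors alone.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX filter: forwards only the subtrees whose root matches
// an absolute child path, without ever building a tree.
final class StreamingPathFilter extends DefaultHandler {
    private final List<String> steps;                        // e.g. [doc, a]
    private final ContentHandler target;
    private final List<String> open = new ArrayList<>();    // names of open elements
    private int matchDepth = 0;                              // > 0 while inside a match

    StreamingPathFilter(String path, ContentHandler target) {
        if (!isStreamable(path)) {
            throw new IllegalArgumentException("not in the streamable subset: " + path);
        }
        this.steps = Arrays.asList(path.substring(1).split("/"));
        this.target = target;
    }

    // True only for paths like /doc/a. A caller would fall back to a
    // tree-based implementation (Axiom or plain DOM) when this is false.
    static boolean isStreamable(String path) {
        return path.matches("(/[A-Za-z_][A-Za-z0-9_.-]*)+");
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        open.add(qName);
        if (matchDepth > 0) {
            matchDepth++;                 // descendant of a matched element
        } else if (open.equals(steps)) {
            matchDepth = 1;               // ancestors + current element match the path
        }
        if (matchDepth > 0) {
            target.startElement(uri, local, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (matchDepth > 0) {
            target.endElement(uri, local, qName);
            matchDepth--;
        }
        open.remove(open.size() - 1);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (matchDepth > 0) {
            target.characters(ch, start, length);
        }
    }

    // Demo only: parses xml in streaming mode and records the qualified
    // names of the elements that pass the filter.
    static List<String> startedElements(String xml, String path) {
        final List<String> names = new ArrayList<>();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new StreamingPathFilter(path, recorder));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

An XInclude transformer could call StreamingPathFilter.isStreamable(expr)
first and only build a tree when it returns false. Expressions that look
backwards or sideways (for example following-sibling:: or last()) are
rejected by this check, which is the point of restricting the subset.
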


-- 
http://www.google.com/profiles/simone.tripodi
