cocoon-dev mailing list archives

From Simone Tripodi <>
Subject Re: XInclude optimization
Date Sun, 22 Nov 2009 15:57:30 GMT
Hi Sylvain,
thanks for your kind reply! I suspected the XPath limitations you
explained so well, but deep in my heart I was hoping for a solution
I didn't know yet, which is why I asked :P :P

I'll take a look at both solutions, even if the first sounds to me
more compliant with the XPointer recommendation and at the same time
closer to what I already did - and to older XInclude cocoon

Thank you very much for your hints, much appreciated :)
A bientot!

P.S. Off-topic: maybe I'm wrong, but I'm sure we met once in Toulouse, I
was one of the Asemantics juniors involved in Joost :P

On Sun, Nov 22, 2009 at 3:27 PM, Sylvain Wallez <> wrote:
> Simone Tripodi wrote:
>> Hi all guys,
>> I'm very sorry if I don't appear frequently on the ML but since April
>> I've been working very hard for a customer in Paris, which doesn't
>> leave me any spare time to dedicate to OS projects.
> Don't be sorry. We all have our own jobs/interest/duties that have driven us
> away from Cocoon. Glad to see you back!
>> I'm writing because I'm sure the XInclude transformer I submitted some
>> time ago could be optimized, so I'd like to ask you for a little help :)
>> The current state is that, when including an entire document, it is
>> processed efficiently through SAX APIs; the problem comes when
>> processing a document referenced by xinclude+xpointer, which forces the
>> processor to extract a sub-document of the included one.
>> To do this, I parse the included document into a DOM, use XPath to
>> extract the sub-document that has to be included, and then walk its
>> nodes to convert them into SAX events. As you
>> noticed, this takes time, too much IMO, but I didn't find/don't know
>> any better solution :(
>> Since you have experience with StAX, maybe you can suggest a fast
>> way to evaluate XPath on a document and emit SAX events, so I'm able
>> to provide you a much better - and faster, above all - solution.
>> Any hint? Every suggestion will be much appreciated.
> The problem with XPath and XML streaming (be it SAX or StAX) is that XPath
> is a language that allows exploring the document tree in all directions and
> thus inherently expects having the whole document tree available, which is
> clearly not compatible with streaming.
> There are different approaches to solving this:
> - use a deferred loading DOM implementation, which buffers events only when
> it needs them to traverse the tree. Axiom [1] provides this IIRC, along with
> an XPath implementation.
> - restrain the XPointer expression to a subset of XPath that can easily be
> implemented on top of a stream. This means restricting selection only on the
> current element, its attributes and its ancestors. There's an implementation
> of this approach in Tika.
> The XInclude transformer can be smart enough to use the most efficient
> implementation for the given XPath expression, i.e. try to parse it with
> Tika's restricted subset, and fall back to something more costly, either
> Axiom or plain DOM.
> Sylvain
> [1]
> [2]
> --
> Sylvain Wallez -
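
To make the second approach concrete, here is a minimal sketch of matching a
restricted path on top of a SAX stream. It is not Tika's code and not the
Cocoon transformer: the class name StreamingPathFilter, the select() helper
and the "child steps only" subset (paths such as /doc/sec) are assumptions
made for illustration. The handler keeps only the stack of ancestor names,
records the events of every matching subtree, and rejects any expression
outside the subset so that a caller could fall back to Axiom or plain DOM.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: streams the subtrees selected by a child-only path. */
class StreamingPathFilter extends DefaultHandler {
    private final String[] steps;                           // e.g. {"doc", "sec"}
    private final Deque<String> ancestors = new ArrayDeque<>();
    // Stands in for the downstream ContentHandler a real transformer would feed.
    private final List<String> events = new ArrayList<>();
    private int matchDepth = -1;                            // -1 while outside a match

    StreamingPathFilter(String path) {
        // Only absolute paths made of plain element names are accepted here.
        if (!path.matches("(/[A-Za-z_][\\w.-]*)+")) {
            throw new IllegalArgumentException("not in the streamable subset: " + path);
        }
        this.steps = path.substring(1).split("/");
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        ancestors.addLast(qName);
        if (matchDepth < 0 && matchesPath()) {
            matchDepth = ancestors.size();                  // remember where the match began
        }
        if (matchDepth >= 0) {
            events.add("start:" + qName);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (matchDepth >= 0) {
            events.add("text:" + new String(ch, start, length));
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (matchDepth >= 0) {
            events.add("end:" + qName);
        }
        if (ancestors.size() == matchDepth) {
            matchDepth = -1;                                // the matched element is closed
        }
        ancestors.removeLast();
    }

    /** True when the ancestor stack, root first, equals the path steps. */
    private boolean matchesPath() {
        if (ancestors.size() != steps.length) {
            return false;
        }
        int i = 0;
        for (String name : ancestors) {
            if (!name.equals(steps[i++])) {
                return false;
            }
        }
        return true;
    }

    /** Parses xml with a non-namespace-aware SAX parser and returns the recorded events. */
    static List<String> select(String path, String xml) throws Exception {
        StreamingPathFilter filter = new StreamingPathFilter(path);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), filter);
        return filter.events;
    }
}
```

A caller would try new StreamingPathFilter(expr) first and catch the
IllegalArgumentException to switch to the slower tree-based path. Memory use
stays proportional to the nesting depth rather than the document size, which
is the point of restricting the expression to ancestors of the current
element.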

