cocoon-dev mailing list archives

From Sylvain Wallez
Subject Re: XInclude optimization
Date Tue, 24 Nov 2009 09:21:41 GMT
Simone Tripodi wrote:
> Hi Sylvain and Simone,
> thanks a lot, the suggestions you provided are all very
> interesting, so I wonder now if it is possible to build a processor
> that uses the Tika way when it recognizes certain kinds of paths,
> and "XSL-on-the-fly" for more complex cases. What do you
> think?

As I suggested previously: first try to parse the XPath expression with 
Tika's parser, and if it fails because the expression doesn't match the 
subset it accepts, fall back to XSL-on-the-fly.

Looking at Tika's parser [1], it looks like you'll have to override the 
parse() method so that it fails hard by throwing an exception rather 
than returning Matcher.FAIL; that way you can detect XPath features 
outside of the subset it accepts.
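
A minimal sketch of that try-then-fall-back flow, in Java. It does not use 
Tika: parseStrict() is a hypothetical stand-in for a parser that throws 
instead of returning a "never matches" object, and the accepted subset 
(absolute paths made of plain child element steps) is an assumption chosen 
only for illustration. All class and method names here are made up.

```java
import java.util.regex.Pattern;

final class XPathStrategyChooser {

    /** The two processing paths discussed in the thread. */
    enum Strategy { STREAMING_MATCHER, XSLT_ON_THE_FLY }

    /** Thrown when an expression is outside the supported subset. */
    static final class UnsupportedXPathException extends Exception {
        UnsupportedXPathException(String xpath) {
            super("Outside the supported XPath subset: " + xpath);
        }
    }

    // Assumed subset for this sketch only: /a/b/c style paths, with no
    // predicates, axes, wildcards or functions.
    private static final Pattern SIMPLE_PATH =
            Pattern.compile("(/[A-Za-z_][A-Za-z0-9_.-]*)+");

    /** Hypothetical strict parser: fails hard instead of failing silently. */
    static void parseStrict(String xpath) throws UnsupportedXPathException {
        if (!SIMPLE_PATH.matcher(xpath).matches()) {
            throw new UnsupportedXPathException(xpath);
        }
    }

    /** Try the cheap subset first; fall back to XSLT when it is rejected. */
    static Strategy choose(String xpath) {
        try {
            parseStrict(xpath);
            return Strategy.STREAMING_MATCHER;
        } catch (UnsupportedXPathException e) {
            return Strategy.XSLT_ON_THE_FLY;
        }
    }
}
```

With this sketch, XPathStrategyChooser.choose("/html/body") selects the 
streaming matcher, while an expression with a predicate such as 
"//p[@id='x']" is rejected by the strict parser and routed to XSLT.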

> Sylvain, I still haven't read the Tika documentation, can you just
> point me to the related doc about this topic?

There's no specific documentation on this particular feature, as it's 
more an internal utility than a primary feature in Tika. That said, the 
code is pretty straightforward.

> Simo, have you already tried generating the XSLT on the fly?
> The most basic approach I thought of is generating the XSL string from a
> template, then passing it to the XSL parser, but I'm sure it could be
> implemented in a better way :P

Sounds like the way to go, but you should cache the resulting template 
object to avoid recreating and reparsing the XSL at every request. The 
same applies to Tika matcher objects.
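
A rough sketch of such a cache, using only the standard JAXP API 
(javax.xml.transform). The class name, the stylesheet produced by 
stylesheetFor(), and the choice to key the cache on the raw XPath string are 
assumptions made for illustration, not what Cocoon does. Compiled Templates 
objects are thread-safe and can be shared; a new Transformer is created from 
them for each request.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XPathTemplatesCache {

    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final ConcurrentMap<String, Templates> cache = new ConcurrentHashMap<>();

    /** Builds the XSL string from a fixed template (illustrative only). */
    static String stylesheetFor(String xpath) {
        // Escape the characters that would break the select="..." attribute.
        String select = xpath.replace("&", "&amp;")
                             .replace("<", "&lt;")
                             .replace("\"", "&quot;");
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
             + "<xsl:output omit-xml-declaration=\"yes\"/>"
             + "<xsl:template match=\"/\">"
             + "<xsl:copy-of select=\"" + select + "\"/>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    /** Returns the compiled stylesheet, compiling it only on a cache miss. */
    Templates templatesFor(String xpath) {
        Templates cached = cache.get(xpath);
        if (cached != null) {
            return cached;
        }
        try {
            Templates compiled;
            // TransformerFactory is not guaranteed to be thread-safe.
            synchronized (factory) {
                compiled = factory.newTemplates(
                        new StreamSource(new StringReader(stylesheetFor(xpath))));
            }
            Templates raced = cache.putIfAbsent(xpath, compiled);
            return raced != null ? raced : compiled;
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Cannot compile XSL for: " + xpath, e);
        }
    }

    /** Applies the cached stylesheet to an XML string. */
    String apply(String xpath, String xml) {
        StringWriter out = new StringWriter();
        try {
            // Transformers are cheap to create and are not shared across threads.
            templatesFor(xpath).newTransformer().transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed for: " + xpath, e);
        }
        return out.toString();
    }
}
```

Calling templatesFor() twice with the same expression returns the same 
Templates instance, so the XSL is generated and parsed once. The map here is 
unbounded; a real component would probably want an eviction policy.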



Sylvain Wallez