cocoon-dev mailing list archives

From: Simone Tripodi
Subject: Re: XInclude optimization
Date: Tue, 24 Nov 2009 10:01:13 GMT
Hi Sylvain,
Sorry, I forgot to ask a quick question in my previous email:
can the Tika code be imported into Cocoon3 and modified there? AFAIK it should
be allowed, but I don't know the conditions under which it can be
A bientot!!!

On Tue, Nov 24, 2009 at 10:29 AM, Simone Tripodi wrote:
> Hi Sylvain,
> I can't thank you enough, this is very much appreciated. I'll
> follow your suggestions :)
> A bientot!!!!
> Simone
> On Tue, Nov 24, 2009 at 10:21 AM, Sylvain Wallez wrote:
>> Simone Tripodi wrote:
>>> Hi Sylvain and Simone,
>>> thanks a lot, the suggestions you provided are all very
>>> interesting. I now wonder whether it is possible to build a processor
>>> that uses the Tika approach when it recognizes certain kinds of
>>> paths, and "XSL-on-the-fly" for more complex cases. What do you
>>> think?
>> As I suggested previously: first try to parse the XPath expression with
>> Tika's parser, and if it fails because the expression doesn't match the
>> subset it accepts, fall back to XSL-on-the-fly.
>> Looking at Tika's parser [1], it looks like you'll have to override the
>> parse() method so that it fails hard by throwing an exception rather than
>> returning Matcher.FAIL; that way you can detect XPath features outside of
>> the subset it accepts.
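
A minimal sketch of the try-then-fall-back idea described above, in Java. It deliberately does not use Tika's classes: `NodeSelector`, `strictParse`, `xslFallback` and `UnsupportedXPathException` are hypothetical stand-ins, and the regular expression is only a placeholder for whatever subset the real parser accepts.

```java
class FallbackSelector {

    /** Hypothetical exception thrown when an expression is outside the fast subset. */
    static class UnsupportedXPathException extends RuntimeException {
        UnsupportedXPathException(String xpath) {
            super("Unsupported XPath: " + xpath);
        }
    }

    /** Hypothetical abstraction over "something that selects nodes". */
    interface NodeSelector {
        String describe();
    }

    /**
     * Stand-in for a strict, Tika-like parser. Assumption: it accepts only
     * plain child paths such as /a/b/c and fails hard on anything else.
     */
    static NodeSelector strictParse(String xpath) {
        if (!xpath.matches("(/[A-Za-z_][A-Za-z0-9_.-]*)+")) {
            throw new UnsupportedXPathException(xpath);
        }
        return () -> "fast:" + xpath;
    }

    /** Stand-in for the slower XSL-on-the-fly strategy. */
    static NodeSelector xslFallback(String xpath) {
        return () -> "xsl:" + xpath;
    }

    /** Try the fast parser first; fall back only when it rejects the expression. */
    static NodeSelector select(String xpath) {
        try {
            return strictParse(xpath);
        } catch (UnsupportedXPathException e) {
            return xslFallback(xpath);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/a/b").describe());
        System.out.println(select("//a[@id='x']").describe());
    }
}
```

Throwing from the strict parser, rather than returning a sentinel value, is what lets the caller tell "matches nothing" apart from "expression not supported".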
>>> Sylvain, I still haven't read the Tika documentation, can you just
>>> point me to the relevant docs on this topic?
>> There's no specific documentation on this particular feature, as it's more
>> of an internal utility than a primary feature of Tika. That said, the code
>> is pretty straightforward.
>>> Simo, have you already tried generating the XSLT on the fly?
>>> The most basic approach I thought of is generating the XSL string from
>>> a template and then passing it to the XSL parser, but I'm sure it could
>>> be implemented in a better way :P
>> Sounds like the way to go, but you should cache the resulting template
>> object to avoid recreating and reparsing the XSL at every request. The same
>> applies to Tika matcher objects.
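
A rough sketch of the caching suggested above, using only the standard `javax.xml.transform` API. The class name `TemplatesCache`, the method `forXPath`, and the shape of the generated stylesheet (a single `xsl:copy-of`) are assumptions made for illustration, not code from Cocoon or Tika.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

class TemplatesCache {

    private final TransformerFactory factory = TransformerFactory.newInstance();

    // Compiled stylesheets keyed by the XPath expression they were generated from.
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();

    /** Returns the compiled stylesheet for this XPath, compiling it only on first use. */
    Templates forXPath(String xpath) {
        return cache.computeIfAbsent(xpath, this::compile);
    }

    private Templates compile(String xpath) {
        // Escape the characters that would break the select="..." attribute.
        String escaped = xpath.replace("&", "&amp;")
                              .replace("<", "&lt;")
                              .replace("\"", "&quot;");
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:template match=\"/\">"
            + "<xsl:copy-of select=\"" + escaped + "\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        // TransformerFactory is not guaranteed to be thread-safe, so guard it.
        synchronized (factory) {
            try {
                return factory.newTemplates(new StreamSource(new StringReader(xsl)));
            } catch (TransformerConfigurationException e) {
                throw new IllegalArgumentException("Cannot compile XSL for: " + xpath, e);
            }
        }
    }
}
```

`Templates` objects are safe to share between threads, so each request only needs to call `newTransformer()` on the cached instance. A real implementation would probably also bound the cache size.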
>> Sylvain
>> [1]
>> --
>> Sylvain Wallez