cocoon-dev mailing list archives

From Simone Tripodi <simone.trip...@gmail.com>
Subject Re: XInclude optimization
Date Tue, 24 Nov 2009 10:01:13 GMT
Hi Sylvain
Sorry, I forgot to ask you a short question in my previous email:
can the Tika code be imported into Cocoon 3 and modified there? AFAIK
it should be allowed, but I don't know under which conditions.
See you soon!
Simo

On Tue, Nov 24, 2009 at 10:29 AM, Simone Tripodi
<simone.tripodi@gmail.com> wrote:
> Hi Sylvain,
> I can't thank you enough, this is very much appreciated. I'll
> follow your suggestions :)
> See you soon!
> Simone
>
> On Tue, Nov 24, 2009 at 10:21 AM, Sylvain Wallez <sylvain@apache.org> wrote:
>> Simone Tripodi wrote:
>>>
>>> Hi Sylvain and Simone,
>>> thanks a lot, the suggestions you provided are all very
>>> interesting. I now wonder whether it is possible to build a processor
>>> that uses both approaches: the Tika way when it recognizes a simple
>>> kind of path, and "XSL-on-the-fly" for more complex cases. What do
>>> you think?
>>>
>>
>> As I suggested previously: first try to parse the XPath expression with
>> Tika's parser, and if it fails because the expression doesn't match the
>> subset it accepts, fall back to XSL-on-the-fly.
>>
>> Looking at Tika's parser [1], it looks like you'll have to override the
>> parse() method so that it fails hard by throwing an exception rather than
>> returning Matcher.FAIL. That way you can detect XPath features outside of
>> the subset it accepts.
>>
>>> Sylvain, I still haven't read the Tika documentation. Can you point
>>> me to the docs related to this topic?
>>>
>>
>> There's no specific documentation on this particular feature, as it's more
>> of an internal utility than a primary feature in Tika. That said, the code
>> is pretty straightforward.
>>>
>>> Simo, have you already tried generating the XSLT on the fly?
>>> The most basic approach I can think of is to generate the XSL string
>>> from a template and then pass it to the XSL parser, but I'm sure it
>>> could be implemented in a better way :P
>>>
>>
>> Sounds like the way to go, but you should cache the resulting template
>> object to avoid recreating and reparsing the XSL at every request. The same
>> applies to Tika matcher objects.
>>
>> Sylvain
>>
>> [1]
>> https://svn.apache.org/repos/asf/lucene/tika/trunk/tika-core/src/main/java/org/apache/tika/sax/xpath/XPathParser.java
>>
>> --
>> Sylvain Wallez - http://bluxte.net
>>
>>
>
>
>
> --
> http://www.google.com/profiles/simone.tripodi
>



-- 
http://www.google.com/profiles/simone.tripodi
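
A minimal sketch of the caching advice quoted above: generate the XSL string
for an XPath expression once, compile it into a javax.xml.transform.Templates
object, and reuse that object on later requests. The class name
XPathTemplateCache, its methods, and the shape of the generated stylesheet
are assumptions made for illustration; none of this is Cocoon or Tika code,
and it relies only on the JAXP API shipped with the JDK (Java 8 or later).
The same map-based approach could presumably hold Tika matcher objects too.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: keeps one compiled stylesheet per XPath expression. */
class XPathTemplateCache {
    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();

    /** Builds a tiny stylesheet that copies the nodes selected by the XPath. */
    static String buildStylesheet(String xpath) {
        // Escape the characters that are unsafe inside an XML attribute value.
        String escaped = xpath.replace("&", "&amp;")
                              .replace("<", "&lt;")
                              .replace("\"", "&quot;");
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
             + "<xsl:output omit-xml-declaration=\"yes\"/>"
             + "<xsl:template match=\"/\">"
             + "<xsl:copy-of select=\"" + escaped + "\"/>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    /** Returns the cached Templates for this XPath, compiling it on first use. */
    Templates get(String xpath) {
        return cache.computeIfAbsent(xpath, key -> {
            try {
                // TransformerFactory is not guaranteed to be thread-safe.
                synchronized (factory) {
                    return factory.newTemplates(
                        new StreamSource(new StringReader(buildStylesheet(key))));
                }
            } catch (TransformerConfigurationException e) {
                throw new IllegalArgumentException("Cannot compile XPath: " + key, e);
            }
        });
    }

    /** Applies the cached stylesheet to an XML string and returns the result. */
    String select(String xpath, String xml) throws TransformerException {
        StringWriter out = new StringWriter();
        // Templates is thread-safe; a Transformer is created per call.
        get(xpath).newTransformer().transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

Only the first request for a given expression pays for generating and
compiling the XSL; later calls to select() with the same expression just
create a new Transformer from the cached Templates.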
