jackrabbit-users mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: how can I say to jackrabbit to index a text when I put a TIFF in the repository?
Date Fri, 28 Mar 2008 06:57:27 GMT
Hi,

On Fri, Mar 28, 2008 at 8:43 AM, Paco Avila <pavila@git.es> wrote:
> On Fri, 28-03-2008 at 08:26 +0200, Jukka Zitting wrote:
>  > Or just a normal string property with the text to be indexed.
>
>  But, in this case, the query can't be:
>
>   /jcr:root//element(*,my:document)[jcr:contains(nt:resource,'hola mundo')]
>
>  and should be something like this (if I store the text in the
>  my:docText property):
>
>   /jcr:root//element(*,my:document)[jcr:contains(my:docText,'hola mundo')]
>
>  because Lucene is not indexing the "document text version".

You could use jcr:contains(., 'hola mundo'), which looks in all
properties of a node.
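
As a rough sketch of how such a statement could be assembled in Java: the
ContainsQueries class and its method names below are hypothetical helpers,
not part of Jackrabbit or the JCR API. The sketch assumes that doubling a
single quote is the right way to escape it inside an XPath string literal,
and that a single property is addressed with a leading @.

```java
/** Hypothetical helper for building jcr:contains XPath statements. */
final class ContainsQueries {
    private ContainsQueries() {
    }

    /** Wraps text in single quotes, doubling any embedded single quote. */
    static String quote(String text) {
        return "'" + text.replace("'", "''") + "'";
    }

    /** Full-text search over all indexed properties of each matching node. */
    static String anyProperty(String nodeType, String searchText) {
        return "/jcr:root//element(*," + nodeType + ")"
                + "[jcr:contains(., " + quote(searchText) + ")]";
    }

    /** Full-text search restricted to one property (note the leading @). */
    static String singleProperty(String nodeType, String property, String searchText) {
        return "/jcr:root//element(*," + nodeType + ")"
                + "[jcr:contains(@" + property + ", " + quote(searchText) + ")]";
    }
}
```

The resulting string would then be passed to
session.getWorkspace().getQueryManager().createQuery(statement, Query.XPATH)
and executed as usual.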

Alternatively, you could also put the text in a TIFF comment and
implement a custom TextExtractor class that pulls that comment for
Jackrabbit to index as the text version of the TIFF file.
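
A minimal sketch of what such an extractor might look like, under a few
assumptions: the "comment" is taken to be the ASCII ImageDescription tag
(270) in the first IFD of the file, and the two public methods are shaped
after org.apache.jackrabbit.extractor.TextExtractor as I remember it from
Jackrabbit 1.x, so check the signatures against your version. The class
name TiffCommentExtractor is made up, and the class deliberately does not
declare "implements TextExtractor" so that it compiles without the
Jackrabbit jar; add that clause when you plug it in.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical extractor that returns the TIFF ImageDescription text. */
final class TiffCommentExtractor {
    private static final int IMAGE_DESCRIPTION = 270; // assumed "comment" tag
    private static final int TYPE_ASCII = 2;

    public String[] getContentTypes() {
        return new String[] { "image/tiff" };
    }

    public Reader extractText(InputStream stream, String type, String encoding)
            throws IOException {
        try {
            // readAllBytes needs Java 9+; the whole file is held in memory.
            return new StringReader(readDescription(stream.readAllBytes()));
        } finally {
            stream.close();
        }
    }

    /** Returns the ImageDescription of the first IFD, or "" if not found. */
    static String readDescription(byte[] data) {
        if (data.length < 8) {
            return "";
        }
        ByteBuffer buf = ByteBuffer.wrap(data);
        if (data[0] == 'I' && data[1] == 'I') {
            buf.order(ByteOrder.LITTLE_ENDIAN);
        } else if (data[0] == 'M' && data[1] == 'M') {
            buf.order(ByteOrder.BIG_ENDIAN);
        } else {
            return "";
        }
        if (buf.getShort(2) != 42) {
            return ""; // not a classic TIFF header
        }
        try {
            int ifd = buf.getInt(4);
            int entries = buf.getShort(ifd) & 0xFFFF;
            for (int i = 0; i < entries; i++) {
                int entry = ifd + 2 + 12 * i; // each IFD entry is 12 bytes
                int tag = buf.getShort(entry) & 0xFFFF;
                int fieldType = buf.getShort(entry + 2) & 0xFFFF;
                int count = buf.getInt(entry + 4);
                if (tag != IMAGE_DESCRIPTION || fieldType != TYPE_ASCII) {
                    continue;
                }
                // Values of up to 4 bytes are stored inline in the entry.
                int start = count <= 4 ? entry + 8 : buf.getInt(entry + 8);
                String text = new String(data, start, count, StandardCharsets.US_ASCII);
                int nul = text.indexOf('\0');
                return nul >= 0 ? text.substring(0, nul) : text;
            }
        } catch (IndexOutOfBoundsException e) {
            // Truncated or corrupt file: fall through and index nothing.
        }
        return "";
    }
}
```

The class would still have to be registered in the textFilterClasses
parameter of the SearchIndex configuration before Jackrabbit uses it.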

>  By the way, can I get the text generated by the text extractors, or
>  is it only used by the Lucene engine?

No, it's only used for Lucene. But of course you can instantiate and
run the text extractors manually on any binary property you like.

BR,

Jukka Zitting
