jackrabbit-users mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: Search in binary Content
Date Wed, 05 Mar 2008 14:56:03 GMT
Hi Katia,

Katia Santos wrote:
> I'm trying to search in PDF binary content. The text is being extracted, but
> when I run the query, I get no results :(
> Does anyone have the same problem, or know what the problem is?
> 
> my query is:
> 
> //*[jcr:contains(.,'myword')]

Did you set the textFilterClasses parameter in your workspace.xml? Please also 
make sure you put all the jar files they depend on into your classpath.
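
For illustration, the SearchIndex element in workspace.xml might look roughly
like the sketch below. The extractor class names are assumptions based on the
jackrabbit-text-extractors module and only cover PDF, Word and plain text, so
please check them against the list of supported classes for your version
before copying anything.

```xml
<!-- Sketch only: SearchIndex element inside workspace.xml -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <!-- Location of the Lucene index for this workspace -->
  <param name="path" value="${wsp.home}/index"/>
  <!-- Comma-separated extractor classes (assumed names, verify for your version) -->
  <param name="textFilterClasses" value="org.apache.jackrabbit.extractor.PdfTextExtractor,org.apache.jackrabbit.extractor.MsWordTextExtractor,org.apache.jackrabbit.extractor.PlainTextExtractor"/>
</SearchIndex>
```

As far as I know, content that was indexed before such a change is not
extracted again automatically, so the workspace may need to be re-indexed
before jcr:contains finds the existing documents.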

Here's a list of supported classes and the corresponding mime types that are 
recognized:
http://jackrabbit.apache.org/jackrabbit-text-extractors.html

See also the query section in the FAQ:
http://jackrabbit.apache.org/frequently-asked-questions.html

regards
  marcel

> I have another problem: text extraction works fine for xls, odt,
> odp, and ods files, but not for pdf, xml, txt, rtf, doc, or ppt
> :(
> No text is extracted for these file types. If someone could help me with
> that...
> 
> Thanks
> 

