jackrabbit-users mailing list archives

From "Dave Brosius" <dbros...@mebigfatguy.com>
Subject RE: Searching within Word Documents
Date Thu, 24 Apr 2008 21:58:45 GMT

Also, make sure that you are specifying the correct MIME type for the nt:file (the jcr:mimeType property on its jcr:content node).

Word can be

"application/vnd.ms-word" or "application/msword"

and PDF is

application/pdf
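
To illustrate the point about MIME types, here is a minimal sketch of choosing the value from a file name before storing the file. The class `MimeTypeGuess`, its method name, and the extension table are hypothetical and not part of Jackrabbit; the fallback to `application/octet-stream` is also an assumption.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not a Jackrabbit class): picks a MIME type from a file name.
final class MimeTypeGuess {
    // Only the types mentioned in this thread plus plain text; extend as needed.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "doc", "application/msword",
            "pdf", "application/pdf",
            "txt", "text/plain");

    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // No extension, or a trailing dot: fall back to a generic binary type.
        if (dot < 0 || dot == fileName.length() - 1) {
            return "application/octet-stream";
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, "application/octet-stream");
    }

    public static void main(String[] args) {
        System.out.println(forFileName("Report.DOC")); // prints application/msword
    }
}
```

With the JCR API, the result would then typically be written with something like `contentNode.setProperty("jcr:mimeType", MimeTypeGuess.forFileName(name))`, where `contentNode` stands for the file's jcr:content node (the variable name is illustrative).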

-----Original Message-----
From: Dave Brosius <dbrosius@mebigfatguy.com>
Sent: Thursday, April 24, 2008 5:48pm
To: users@jackrabbit.apache.org
Cc: amabogunje@primavera.com
Subject: RE: Searching within Word Documents

Check the textFilterClasses parameter of the SearchIndex element in repository.xml to see which
text extractors you are using.

org.apache.jackrabbit.extractor.MsWordTextExtractor

should be there for Word documents.

org.apache.jackrabbit.extractor.PdfTextExtractor

should be there for PDFs.
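
The check described above can be sketched in code. This is only an illustration, assuming the extractors are listed as a comma-separated `value` attribute on a `<param name="textFilterClasses" .../>` element inside `<SearchIndex>`; the class `TextFilterCheck` and its method are hypothetical names, not Jackrabbit API.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a Jackrabbit class): lists the configured text extractors.
final class TextFilterCheck {
    static List<String> configuredExtractors(InputStream repositoryXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(repositoryXml);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
        List<String> result = new ArrayList<>();
        NodeList indexes = doc.getElementsByTagName("SearchIndex");
        for (int i = 0; i < indexes.getLength(); i++) {
            NodeList params = ((Element) indexes.item(i)).getElementsByTagName("param");
            for (int j = 0; j < params.getLength(); j++) {
                Element param = (Element) params.item(j);
                if (!"textFilterClasses".equals(param.getAttribute("name"))) {
                    continue;
                }
                // Assumed format: class names separated by commas.
                for (String cls : param.getAttribute("value").split(",")) {
                    String trimmed = cls.trim();
                    if (!trimmed.isEmpty() && !result.contains(trimmed)) {
                        result.add(trimmed);
                    }
                }
            }
        }
        return result;
    }
}
```

If the returned list does not contain org.apache.jackrabbit.extractor.MsWordTextExtractor or org.apache.jackrabbit.extractor.PdfTextExtractor, that would explain why Word or PDF content is not being indexed.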




-----Original Message-----
From: mabogunje <amabogunje@primavera.com>
Sent: Wednesday, April 23, 2008 10:06am
To: users@jackrabbit.apache.org
Subject: Searching within Word Documents


Is Jackrabbit able to search within the contents of an MS Word document? I
have a test that searches the jcr:content of plain text, Excel (csv) and
Word documents; however, the Word document search always returns no results.
I am wondering whether this is supported by Jackrabbit.


PS: Same question for PDFs?
-- 
View this message in context: http://www.nabble.com/Searching-within-Word-Documents-tp16834621p16834621.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.





