jackrabbit-users mailing list archives

From Mark Herman <MHer...@NBME.org>
Subject Re: textFilterClasses deprecated. How to specify extractors?
Date Fri, 06 Apr 2012 00:52:52 GMT

Robert Siska wrote
> How does it know what binary files it should index when I'm not
> specifying any extractors? How can I disable/enable them?

I'm not an expert, but what I do know is that JR uses Tika to extract text,
and it picks the extraction method based on the jcr:mimeType property. If you
don't supply a mimetype, then it won't know how to extract the text (although
I wouldn't recommend leaving it out as a practice). I believe there is a way
to supply JR with a Tika config that might give you what you want.
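
To make the mimetype point concrete, here is a toy Java sketch of the idea
as I understand it. It is not Jackrabbit or Tika code: the class name
MimeDispatchSketch, the extractText method and the one-entry registry are
all made up for illustration. It only shows the general pattern of picking
an extractor by MIME type and getting nothing back when the type is missing
or unknown.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in Jackrabbit or Tika.
class MimeDispatchSketch {
    // Made-up registry: MIME type -> function that turns raw bytes into text.
    private static final Map<String, Function<byte[], String>> EXTRACTORS = new HashMap<>();

    static {
        // Assumption for the sketch: plain text is simply decoded as UTF-8.
        EXTRACTORS.put("text/plain", data -> new String(data, StandardCharsets.UTF_8));
    }

    // Returns the extracted text, or empty when no extractor can be chosen.
    static Optional<String> extractText(String mimeType, byte[] data) {
        if (mimeType == null) {
            // No MIME type supplied: nothing to dispatch on, so nothing is extracted.
            return Optional.empty();
        }
        Function<byte[], String> extractor = EXTRACTORS.get(mimeType);
        if (extractor == null) {
            // Unknown MIME type: no extractor registered for it.
            return Optional.empty();
        }
        return Optional.of(extractor.apply(data));
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(extractText("text/plain", data).isPresent()); // true
        System.out.println(extractText(null, data).isPresent());         // false
    }
}
```

In the real repository the type would come from the jcr:mimeType property on
the node rather than from a method argument.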

Additionally, you can specify an indexing config in the repository/workspace
xml files, where you can set some rules on what gets indexed and how by
View this message in context: http://jackrabbit.510166.n4.nabble.com/textFilterClasses-deprecated-How-to-specify-extractors-tp4534050p4536443.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
