jackrabbit-users mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: Setting up search
Date Tue, 03 Oct 2006 18:39:25 GMT
Philip Q wrote:
> 1. There are two (major) custom nodetypes in the repository, one 
> designed to store plain-text data, the other is designed to store binary 
> data (just in properties of those nodetypes).
> Do I need to do anything special to convince Jackrabbit to index these 
> nodes? What about processing the binary data with the textfilters?

the binary data must be stored in a node of type nt:resource or a 
subtype thereof.

then you need to configure the text filter classes in repository.xml 
(and any existing workspace.xml files) that you want to use.
See also:
https://svn.apache.org/repos/asf/jackrabbit/trunk/textfilters/README.txt

> 2. How would I search the binary data? Would a standard XPath query work 
> on it as-if it was plain-text (assuming it was an PDF/Word/parseable file)?

once the resource nodes have gone through the text filters you can 
search binary content using the jcr:contains function:

//element(*, nt:resource)[jcr:contains(., 'foo')]
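
A minimal sketch of how such a statement could be assembled and handed to the JCR query API. The ContainsQuery class and its build method are hypothetical names used only for illustration; they are not part of Jackrabbit. The helper only handles plain search terms and rejects quotes and backslashes rather than escaping them. The commented lines assume an already opened javax.jcr.Session called session.

```java
// Hypothetical helper, not part of Jackrabbit or the JCR API.
final class ContainsQuery {

    // Builds //element(*, <nodeType>)[jcr:contains(., '<term>')].
    static String build(String nodeType, String term) {
        // Quotes and backslashes need escaping inside the search
        // expression; this sketch rejects them instead of escaping.
        for (char c : term.toCharArray()) {
            if (c == '\'' || c == '"' || c == '\\') {
                throw new IllegalArgumentException(
                        "unsupported character in term: " + c);
            }
        }
        return "//element(*, " + nodeType + ")[jcr:contains(., '" + term + "')]";
    }

    public static void main(String[] args) {
        String stmt = build("nt:resource", "foo");
        System.out.println(stmt);
        // With a javax.jcr.Session at hand (assumed here), the statement
        // would be executed through the standard JCR query API:
        //   QueryManager qm = session.getWorkspace().getQueryManager();
        //   Query q = qm.createQuery(stmt, Query.XPATH);
        //   NodeIterator hits = q.execute().getNodes();
    }
}
```

Each node returned by the iterator is a matching nt:resource node, so its parent is usually the node you want to show to the user.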

> 3. Since I can't do a full-text query with the JCR/Jackrabbit interface, 
> but I assume I can just use Lucene to open the index. If I did, what 
> field(s) would I need to query, and what kind of path would I get back? 
> (A pointer to the source code that does this would also be very helpful).

JCR *does* provide a way to execute a fulltext query. see above. There 
is no need to query the underlying index directly using plain lucene.

regards
  marcel
