jackrabbit-users mailing list archives

From Patrick Wider <pat_wi...@yahoo.fr>
Subject Binary Content Search Problem...
Date Thu, 18 Oct 2007 13:02:51 GMT
Hi all,
I'm setting up a new Jackrabbit repository, which is backed by an Oracle DB for persistence.
Access to the created nodes and their properties works fine, except when I try to execute
(basic?) queries like:

 "/jcr:root//element(*, nt:resource)[(jcr:contains(., 'myKeyWord'))]"

which is supposed to return all nt:resource nodes whose jcr:data binary content contains 'myKeyWord',
isn't it?
But it doesn't, and I have no clue where I made the mistake.
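
For reference, a minimal sketch of how such a statement can be assembled in Java. The class and method names (ContainsQuery, build, escapeLiteral) are hypothetical and not part of the JCR or Jackrabbit API. The sketch assumes the keyword only needs its single quotes doubled, as in an XPath string literal, and does not handle special characters of the full-text search syntax itself. It omits the redundant parentheses around jcr:contains, which should not change the meaning.

```java
// Hypothetical helper: builds the XPath full-text query used above.
final class ContainsQuery {

    // Doubles single quotes so the keyword stays inside the string literal.
    static String escapeLiteral(String s) {
        return s.replace("'", "''");
    }

    // nodeType is inserted as-is (assumed to be a valid node type name).
    static String build(String nodeType, String keyword) {
        return "/jcr:root//element(*, " + nodeType + ")"
                + "[jcr:contains(., '" + escapeLiteral(keyword) + "')]";
    }

    public static void main(String[] args) {
        System.out.println(build("nt:resource", "myKeyWord"));
    }
}
```

The resulting string would then be passed to QueryManager.createQuery(statement, Query.XPATH) from the JCR API, and the matching nodes read from QueryResult.getNodes().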
 
First, I checked my workspace.xml file, and particularly the SearchIndex element:

<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <param name="textFilterClasses" value="org.apache.jackrabbit.extractor.PlainTextExtractor,
                                           org.apache.jackrabbit.extractor.MsWordTextExtractor,
                                           ...  many more extractors"/>
</SearchIndex>
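
One thing that can be checked independently of Jackrabbit is whether every class named in the textFilterClasses value can actually be loaded. The following is only a sketch: the class ExtractorClassCheck and its methods are hypothetical, and it assumes the value is a comma-separated list of class names with optional surrounding whitespace, as it appears in the configuration above. Whether a missing class really explains the empty result is a guess, not something confirmed here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic: checks whether configured extractor classes are on the classpath.
final class ExtractorClassCheck {

    // Splits the param value on commas and drops whitespace and empty entries.
    static List<String> parseClassList(String value) {
        List<String> names = new ArrayList<>();
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Returns true if the class can be found, without initializing it.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ExtractorClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Only the two extractors named explicitly in the configuration are listed here.
        String value = "org.apache.jackrabbit.extractor.PlainTextExtractor,\n"
                + "    org.apache.jackrabbit.extractor.MsWordTextExtractor";
        for (String name : parseClassList(value)) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "NOT on classpath"));
        }
    }
}
```

Run with the same classpath as the repository, this prints one line per configured class.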

Then, I defined a node type as follows:
[wider:file] > 'nt:file', 'mix:referenceable'

Then I created a wider:file node as parent of the jcr:content node (nt:resource type):
Calendar cal = ...;
String mimetype=...;
File myFile = new File...;
InputStream inputstr = new FileInputStream(myFile);   
Node fileNode = myRoot.addNode(myFile.getName(), "wider:file");
Node resourceNode = fileNode.addNode("jcr:content", "nt:resource");
resourceNode.setProperty("jcr:mimeType", mimetype); // I made sure it is a valid one
resourceNode.setProperty("jcr:encoding", "");
resourceNode.setProperty("jcr:lastModified", cal);
resourceNode.setProperty("jcr:data", inputstr);
mySession.save();

I made sure the MIME types are OK. I have actually created 2 nt:resource nodes: one with
a Word document (mimetype=application/msword) and the other one with a text file (mimetype=text/plain).

Of course both files contain 'myKeyWord'. The text file contains it for sure, but
in the Word document 'myKeyWord' is wrapped in bold and italic styles. I don't think the styles
cause any problems, but then again I have no idea how the extractors work ;-) so it's
just a guess.

And, as said before, these nodes do exist in the repository: I can query them and their
properties, and the jcr:data property can be roughly displayed. Only the jcr:contains function
does not seem to work.

Maybe you should also know that the externalBLOBs param is set to false, and that I
use Jackrabbit 1.3.1 with Lucene 2.0.0.

I really have no idea what I did wrong. Thanks for your help.
Regards,
Patrick

