jackrabbit-users mailing list archives

From Kurz Wolfgang <wolfgang.k...@gwvs.de>
Subject Problem getting full textual search to work with textextractors
Date Thu, 26 Mar 2009 15:20:17 GMT
Hello everyone,

I am trying to get full-text search to work with text extractors.


I uploaded a PDF file as a resource into Jackrabbit. That part works: I can download it
and I get the same file back.

Now I want to implement full-text search inside the documents I uploaded, but the search doesn't
find them even though they contain the strings I am searching for.

What I did is this:

I added these jar files to my Tomcat server lib folder, since I am using JNDI to connect:

-jackrabbit-text-extractors-1.5.0.jar
-fontbox-0.1.0.jar
-junit-3.8.1.jar
-nekohtml-1.9.7.jar
-pdfbox-0.7.3.jar
-poi-3.0.2-FINAL.jar
-poi-scratchpad-3.0.2-FINAL.jar
-tm-extractors-0.4.jar
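
One way to confirm that the jars above are actually visible to the code that runs the repository is a small classpath check. This is only a sketch: the class ClasspathCheck and its method are hypothetical helpers, and the two class names in main are assumptions about what jackrabbit-text-extractors-1.5.0.jar and pdfbox-0.7.3.jar contain.

```java
final class ClasspathCheck {

    // Returns true if the named class can be loaded by the current class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class names, not verified against the jars listed above.
        String[] candidates = {
            "org.apache.jackrabbit.extractor.PdfTextExtractor",
            "org.pdfbox.pdmodel.PDDocument"
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + isOnClasspath(name));
        }
    }
}
```

The result depends on which class loader runs the check, so it is only meaningful when called from the same place (Tomcat shared lib or the web application) that creates the repository.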

My XPath query looks like this:

//*[((jcr:contains(.,'consetetur')) or (jcr:contains(.,'sadipscing')))]

Both of those words are in the PDF, but the search result is empty.
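
For reference, a query string of this shape can be assembled from a list of search terms. The following is a minimal sketch, not the code actually used here: the class ContainsQueryBuilder and its build method are hypothetical, and doubling single quotes is assumed to be enough escaping for simple terms.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ContainsQueryBuilder {

    // Builds an XPath query matching any node whose full-text index
    // contains at least one of the given terms.
    static String build(List<String> terms) {
        if (terms.isEmpty()) {
            throw new IllegalArgumentException("at least one search term is required");
        }
        String clauses = terms.stream()
                // Assumption: a single quote inside the literal is escaped by doubling it.
                .map(t -> "(jcr:contains(.,'" + t.replace("'", "''") + "'))")
                .collect(Collectors.joining(" or "));
        return "//*[(" + clauses + ")]";
    }

    public static void main(String[] args) {
        // Prints the same query string as shown above.
        System.out.println(build(Arrays.asList("consetetur", "sadipscing")));
    }
}
```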

Here is the code I use to run the search:

javax.jcr.query.Query jcrQuery;
try {
    jcrQuery = session.getWorkspace().getQueryManager().createQuery(query, language);
    QueryResult queryResult = jcrQuery.execute();
    NodeIterator nodeIterator = queryResult.getNodes();
    return nodeIterator;
}
catch (InvalidQueryException iqe) {
    throw new org.apache.jackrabbit.ocm.exception.InvalidQueryException(iqe);
}
catch (RepositoryException re) {
    throw new ObjectContentManagerException(re.getMessage(), re);
}


It would be great if anyone had an idea why this doesn't work.

Thanks a lot in advance,
Wolfgang
