jackrabbit-users mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: Excerpts Question
Date Wed, 04 Jun 2008 08:34:51 GMT
Hi Marc,

please check the following points:

- configuration changes to repository.xml only affect newly created workspaces; 
make sure you also changed the workspace.xml files of any existing workspaces

- changes to the parameters 'supportHighlighting' and 'textFilterClasses' require 
that you re-index the workspace; otherwise only newly added resources are 
indexed according to the new values.
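
As a rough illustration of both points (a sketch, not taken from your setup): the search settings of an existing workspace live in the SearchIndex element of that workspace's workspace.xml. The path value shown is the usual default and is an assumption here; your file may differ.

```xml
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <!-- assumed default index location; check your own workspace.xml -->
  <param name="path" value="${wsp.home}/index"/>
  <param name="supportHighlighting" value="true"/>
  <param name="excerptProviderClass"
         value="org.apache.jackrabbit.core.query.lucene.DefaultHTMLExcerpt"/>
  <!-- textFilterClasses as in your repository.xml -->
</SearchIndex>
```

One common way to force a re-index is to shut down the repository, remove the index directory that the path parameter points to, and start the repository again; the index is then rebuilt on startup.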

regards
  marcel

Marc Schriftman wrote:
> Hey y'all
> 
> A quick excerpt question, if you don't mind. I've configured my repository
> for excerpts:
> 
> <param name="supportHighlighting" value="true"/>
> <param name="excerptProviderClass"
> value="org.apache.jackrabbit.core.query.lucene.DefaultHTMLExcerpt"/>
> <param name="textFilterClasses" value="
>             org.apache.jackrabbit.extractor.HTMLTextExtractor,
>             org.apache.jackrabbit.extractor.MsExcelTextExtractor,
>             org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
>             org.apache.jackrabbit.extractor.MsWordTextExtractor,
>             org.apache.jackrabbit.extractor.PdfTextExtractor,
>             org.apache.jackrabbit.extractor.PlainTextExtractor
> "/>
> 
> and my code looks like this:
> 
> Query query = queryManager.createQuery(
>         "//element(*, nt:resource)[jcr:contains(., '" + partial
>         + "')]/(@jcr:uuid|rep:excerpt(.))", Query.XPATH);
> RowIterator iter = query.execute().getRows();
> while (iter.hasNext()) {
>     final Row row = iter.nextRow();
>     final String uuid = row.getValue("jcr:uuid").getString();
>     final String excerpt = row.getValue("rep:excerpt(.)").getString();
>     getWriter().println(excerpt);
> }
> 
> and this is what I'm getting:
> 
> <excerpt><fragment>238b244d-8ed2-4e6b-b319-1c26256eb580 ...
> 63f7bdc2-0667-4366-bed8-5c0928fba5d2 ...
> application/vnd.ms-powerpoint</fragment></excerpt>
> <excerpt><fragment>0affc599-1dfc-4813-8c57-93a8d6349226 ...
> f00a9ba8-7e69-4337-be02-49fcffc6fb72 ...
> application/pdf</fragment></excerpt>
> 
> 
> Anyone know what I'm doing wrong? It feels like it might be configuration
> related, since that's not even the correct format for the
> DefaultHTMLExcerpt, but what's with the guid weirdness?
> 
> Thanks in advance,
> 
> Marc Schriftman
> 

