Subject: Re: Excerpts on search queries.
From: Simon Gaeremynck
Date: Fri, 27 Nov 2009 16:45:07 +0000
To: users@jackrabbit.apache.org
Message-Id: <85E27883-0EB8-4E1F-8F27-3758A6824222@gmail.com>

Thanks for the quick response!

I am now able to retrieve excerpts in one query for both content and properties.
According to http://issues.apache.org/jira/browse/JCR-920 it looks like it should work for properties (and it sort of does!)

The only thing I have to figure out is how to only show the highlighted bit and the content around it for that specific property.
This will probably need a custom excerpt provider.

Thanks for the help!

Simon

On 27 Nov 2009, at 16:03, Alexander Klimetschek wrote:

> On Sat, Nov 28, 2009 at 00:00, Simon Gaeremynck wrote:
>> I am trying to do a basic search query on the entire repository that returns
>> any node that has a sling:resourceType property and that has a
>> property/content that matches a search string.
>> If there is a match I need to get all the properties of that node and an
>> excerpt of the matched string.
>>
>> I have the following XPath query, which returns the correct nodes:
>> //*[@sling:resourceType and (jcr:contains(jcr:content,'*test*') or jcr:contains(.,'*test*'))]
>>
>> This query gives me the parent of the jcr:content node and all the
>> properties on it, which is exactly what I want.
>>
>> Now, I thought I could get the excerpts out of the results by appending
>> /rep:excerpt(.) like so:
>> //*[@sling:resourceType and (jcr:contains(jcr:content,'*test*') or jcr:contains(.,'*test*'))]/rep:excerpt(.)
>>
>> But this doesn't give me a usable excerpt.
>
> I guess the excerpt is only useful when done on the jcr:content node.
> So I think this query should do the trick:
>
> //*[@sling:resourceType and (jcr:contains(jcr:content,'test') or jcr:contains(.,'test'))]/rep:excerpt()
>
> and then you'd call the rep:excerpt() with the jcr:content as parameter:
>
> Row row = ...
> Value excerpt = row.getValue("rep:excerpt(jcr:content)");
>
> In this case you could also switch between the excerpt from a
> jcr:content subnode or the node itself, depending on what exists.
>
> Note that I also changed "*test*" to "test" as that should give the
> same results. jcr:contains does a full text search, not an exact
> comparison, hence word stemming etc. all take place.
>
>> Another problem I'm facing with excerpts is when I try to get an excerpt out
>> of a property/jcr:content with no (or almost no) content around the
>> highlighted word/phrase.
>> It seems to fill up the space by taking other property values of the node.
>
> You can have a custom excerpt provider:
> http://wiki.apache.org/jackrabbit/ExcerptProvider
>
> Regards,
> Alex
>
> --
> Alexander Klimetschek
> alexander.klimetschek@day.com
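
For reference, the query and column name suggested in the reply above can be put together roughly as follows. This is only a sketch, not code from the thread: the class ExcerptQuery and its methods are hypothetical names, and it deliberately uses plain strings so that it compiles without the JCR jars. It assumes that a single quote inside an XPath string literal is escaped by doubling it, and it does not escape characters that are special to the jcr:contains search syntax itself.

```java
// Hypothetical helper (not part of Jackrabbit or JCR): builds the XPath
// statement from the thread and the column name used to read the excerpt.
public final class ExcerptQuery {
    private ExcerptQuery() {}

    /** Wraps a term in single quotes, doubling embedded quotes (assumed XPath escaping). */
    public static String quote(String term) {
        return "'" + term.replace("'", "''") + "'";
    }

    /** Nodes with a sling:resourceType whose own content or jcr:content child matches the term. */
    public static String build(String term) {
        String literal = quote(term);
        return "//*[@sling:resourceType and (jcr:contains(jcr:content," + literal
                + ") or jcr:contains(.," + literal + "))]/rep:excerpt()";
    }

    /** Column name for the excerpt of a relative path, e.g. "jcr:content" or ".". */
    public static String excerptColumn(String relPath) {
        return "rep:excerpt(" + relPath + ")";
    }

    public static void main(String[] args) {
        // With the JCR API on the classpath, the first string would be handed to
        // QueryManager.createQuery(statement, Query.XPATH) and the second to
        // Row.getValue(columnName) on each row of the result.
        System.out.println(build("test"));
        System.out.println(excerptColumn("jcr:content"));
    }
}
```

Passing "." instead of "jcr:content" to excerptColumn gives the column for the node itself, which matches the suggestion to switch between the two depending on what exists.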