To: users@jackrabbit.apache.org
From: Christoph Kiehl
Subject: Re: How to find similar nodes(somewhat like google similar pages)?
Date: Thu, 05 Apr 2007 08:34:46 +0200

alartin wrote:
> Hi all,
> Given a node with text content, I want to find nodes that have similar
> text content. It is somewhat like finding similar pages using Google (just
> think of a page as a kind of node, with its content as a node property). I
> think Lucene supports this via term vectors, and I wonder whether a
> Jackrabbit query can do it or not. Many thanks.

Right now Jackrabbit doesn't support this, probably mainly because the JCR spec
doesn't define such a thing. I think Marcel recently added search term
highlighting, which isn't defined by the JCR spec either. Maybe one could add a
custom XPath function? But I'm not quite sure this is functionality a content
repository should be responsible for.

Cheers,
Chris
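
The term-vector idea raised in the question can be sketched outside the repository. This is only a minimal illustration, assuming the text of each node has already been read into plain strings; the function names (term_vector, cosine_similarity, most_similar) and the sample node paths are hypothetical, and nothing here uses the Jackrabbit or Lucene APIs.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Return a term-frequency vector (term -> count) for the given text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors (0.0 to 1.0)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(target_text, candidates, top_n=3):
    """Rank candidate (path, text) pairs by similarity to target_text."""
    target = term_vector(target_text)
    scored = [(cosine_similarity(target, term_vector(text)), path)
              for path, text in candidates]
    scored.sort(reverse=True)  # highest score first
    return scored[:top_n]


# Hypothetical node paths and text, purely for illustration.
nodes = [
    ("/content/a", "Jackrabbit is a content repository for Java"),
    ("/content/b", "Lucene is a full-text search library"),
    ("/content/c", "A content repository stores nodes and properties"),
]
ranking = most_similar("content repository nodes", nodes)
```

Raw term counts are the simplest possible weighting. A real implementation would more likely weight terms by how rare they are across the whole index, which is one reason this kind of feature tends to live in the search layer rather than in the repository itself.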