From: Ard Schrijvers
To: users@jackrabbit.apache.org
Date: Thu, 9 Sep 2010 16:56:43 +0200
Subject: Re: About jcr:contains for binary content and properties

On Thu, Sep 9, 2010 at 2:43 PM, Paco Avila wrote:
> In the Jackrabbit FAQ I see this question:
>
> Why doesn't //*[jcr:contains(@jcr:data, 'foo')] return matches for
> binary content? Extracted text from binary content is only indexed on
> the parent node of the @jcr:data property. Use jcr:contains() on the
> nt:resource node.
>
> My problem is that this query:
>
> //element(*, nt:file)[jcr:contains(jcr:content, 'foo')]
>
> will also match jcr:content nodes with the 'foo' text in their
> properties. For example, if I want to find documents with the word
> "pdf" inside, it will also match documents whose jcr:mimeType is
> 'application/pdf'.
>
> How can I search only the indexed binary data?

You would need to hook into the indexing then. I answered a similar
question about two weeks ago; you should be able to find it in the
archive.

Regards
Ard

> --
> OpenKM
> http://www.openkm.com
> http://www.guia-ubuntu.org
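
To make the behaviour discussed above concrete, here is a minimal sketch of why a node-scope jcr:contains() on jcr:content can match "pdf" through jcr:mimeType, and what excluding a property from the node-scope text changes. It is a toy model and not Jackrabbit's indexer: the class NodeScopeIndexSketch, its methods, and the sample values are hypothetical names made up for illustration. In Jackrabbit itself, the indexing configuration (index rules with a nodeScopeIndex attribute on a property) is one place such an exclusion might be expressed; whether that is what the reply means by hooking into the indexing is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: this is NOT Jackrabbit's indexer, just the idea behind it.
class NodeScopeIndexSketch {

    // Builds the text a node-scope jcr:contains() would search for one
    // jcr:content node: string properties plus text extracted from jcr:data.
    // Properties named in excludedFromNodeScope are left out.
    static String nodeScopeText(Map<String, String> properties,
                                String extractedBinaryText,
                                Set<String> excludedFromNodeScope) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            if (!excludedFromNodeScope.contains(e.getKey())) {
                parts.add(e.getValue());
            }
        }
        parts.add(extractedBinaryText);
        return String.join(" ", parts).toLowerCase(Locale.ROOT);
    }

    // Crude stand-in for jcr:contains(): split on non-alphanumeric
    // characters and look for an exact term match.
    static boolean contains(String nodeScopeText, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        for (String token : nodeScopeText.split("[^\\p{Alnum}]+")) {
            if (token.equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("jcr:mimeType", "application/pdf");
        props.put("jcr:encoding", "UTF-8");
        String extracted = "Quarterly report on rabbit populations";

        // Default: every string property contributes to the node-scope text.
        String all = nodeScopeText(props, extracted, Collections.emptySet());
        // Restricted: jcr:mimeType is kept out of the node-scope text.
        String restricted =
                nodeScopeText(props, extracted, Collections.singleton("jcr:mimeType"));

        System.out.println(contains(all, "pdf"));           // true (via jcr:mimeType)
        System.out.println(contains(restricted, "pdf"));    // false
        System.out.println(contains(restricted, "rabbit")); // true (extracted text)
    }
}
```

In this model the false positive disappears once jcr:mimeType no longer feeds the node-scope text, while matches on the extracted binary text are unaffected. Changing what is indexed in a real repository would normally also require re-indexing existing content.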