Message-ID: <474A9D6B.7030006@gmx.net>
Date: Mon, 26 Nov 2007 11:18:19 +0100
From: Marcel Reutegger
To: dev@jackrabbit.apache.org
Subject: Re: Indexing of properties setProperty(String, InputStream)
References: <474A9324.1050408@wyona.com>
In-Reply-To: <474A9324.1050408@wyona.com>

Hi Michael,

these are rather questions for the user list, but anyway...

Michael Wechner wrote:
> I am using setProperty(String, InputStream) resp. setProperty("content",
> new InputStream(...)) in order to save XHTML and other "bigger" content.
> Also I am using the TransientRepository implementation.
>
> When I am searching with xpath, something like //*[@content] then I
> don't receive any results whereas properties being set with
> setProperty(String, String) are being found.
>
> Now I am very sure the "content" properties do exist, because I read and
> write to them without a problem.
>
> So my guess is that properties being set through setProperty(String,
> InputStream) are not being indexed by default, because it could be any
> kind of data, right?

that's correct. the JCR specification says that binary properties are not
indexed, basically for the reason you mentioned: the data can be anything.

> But I can get them indexed?

yes, if you store the binary as an nt:resource node. this gives jackrabbit
the information it needs to index the binary (MIME type and encoding).
furthermore, you need to configure text extractors in the configuration:

http://jackrabbit.apache.org/doc/components/text-extractors.html

> Shall I rather use
> setProperty(String, Value, int) and set the type to String and use
> Value.getStream() ?

that's an alternative, but then you will get matches for tag names as well,
while you are probably only interested in the text between the elements and
the attribute values.

regards
 marcel
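
To illustrate the last point about tag names: the sketch below shows, using only the JDK's SAX parser, how the text between the elements and the attribute values can be separated from the markup. It is not Jackrabbit's text extractor and does not use the JCR API. The class name XhtmlTextSketch and the method extractText are hypothetical, and the input is assumed to be well-formed XHTML without an external DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of Jackrabbit or the JCR API.
public class XhtmlTextSketch {

    /**
     * Returns the character data and attribute values of a well-formed
     * XHTML string, separated by single spaces. Tag names are dropped.
     */
    public static String extractText(String xhtml) {
        final StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Separate this element from preceding text, then keep
                // only the attribute values (never the element name).
                out.append(' ');
                for (int i = 0; i < atts.getLength(); i++) {
                    out.append(atts.getValue(i)).append(' ');
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so no
                // separator is added here.
                out.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.append(' ');
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XHTML", e);
        }
        // Collapse the separators into single spaces.
        return out.toString().trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String xhtml =
            "<html><body><p class=\"intro\">Hello <b>world</b></p></body></html>";
        System.out.println(extractText(xhtml)); // prints "intro Hello world"
    }
}
```

Indexing the raw XHTML as a String property would also make "html", "body", "p" and "b" searchable terms; text extracted this way contains only "intro", "Hello" and "world".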