From: Florent Guillaume
To: dev@jackrabbit.apache.org
Date: Mon, 24 Apr 2006 15:49:43 +0200
Subject: Re: efficient note type indexing

Hi,

Marcel Reutegger wrote:
> Florent Guillaume wrote:
>> I have a node
>> that has lots of unordered children nodes. Some of these
>> nodes are "real children" in the document management sense, the others
>> (in small number) are just nodes that hold complex datatypes but are
>> really part of the main document.
>>
>> I'd like to access both categories of nodes in an efficient manner:
>> - get only the nodes for my complex datatypes,
>> - get the list of "real children" nodes.
>
> when you say 'get a list of child nodes' isn't it easier just using the
> api instead of a query? Node.getNodes() and then have a custom
> NodeIterator that filters out unnecessary nodes?

An iterator that filters while iterating would be ok when most of the
nodes match, but in the case where the nodes that I want are those in
small numbers (and which may be at the end of the iterator list), it's
inefficient. That's why I mentioned indexed queries.

>> I have flexibility in deciding how these node are typed. I can have
>> mixin types that are used as marker interface for these two
>> categories. Or (preferably) I can rely on the supertypes for my node
>> types to distinguish between the two.
>>
>> What would you recommend so that my queries are processed efficiently,
>> using underlying indexes?
>
> using different types for the child nodes is definitively a good idea,
> as it helps narrowing down the set of nodes that may match.

If I have the (non-mixin) types:

  [my:bar]
  ...
  [my:foo] > my:bar
  ...
  [my:gee] > my:bar
  ...

the spec (6.6.3.2) tells me that I can query
  //element(*, my:bar)
and I'll get my:foo and my:gee nodes too. But is this implemented in
jackrabbit using efficient indexes, or is there an iteration and
comparison going on?

Thanks,
Florent

-- 
Florent Guillaume, Nuxeo (Paris, France)
Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   fg@nuxeo.com
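
To make the distinction in the question concrete (filtering while iterating versus a lookup in an index keyed by node type), here is a toy Java sketch. It deliberately does not use the JCR API and makes no claim about how Jackrabbit evaluates //element(*, my:bar); the class TypeIndexSketch, its methods, the my:doc type and the node paths are all hypothetical, and single inheritance is assumed for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Jackrabbit's implementation, just the two strategies side by side.
final class TypeIndexSketch {
    // node type -> declared supertype (null when there is none)
    private final Map<String, String> superOf = new HashMap<>();
    // all nodes in insertion order, as {path, primaryType}
    private final List<String[]> nodes = new ArrayList<>();
    // index: type name -> paths of nodes of that type or any subtype
    private final Map<String, List<String>> index = new HashMap<>();

    void declareType(String type, String supertype) {
        superOf.put(type, supertype);
    }

    void addNode(String path, String primaryType) {
        nodes.add(new String[] {path, primaryType});
        // Pay the cost at write time: register the node under its type and every supertype.
        for (String t = primaryType; t != null; t = superOf.get(t)) {
            index.computeIfAbsent(t, k -> new ArrayList<>()).add(path);
        }
    }

    // Strategy 1: iterate over every node and compare types (cost grows with all nodes).
    List<String> byScan(String wanted) {
        List<String> result = new ArrayList<>();
        for (String[] node : nodes) {
            for (String t = node[1]; t != null; t = superOf.get(t)) {
                if (t.equals(wanted)) {
                    result.add(node[0]);
                    break;
                }
            }
        }
        return result;
    }

    // Strategy 2: a single map lookup (cost grows only with the number of matches).
    List<String> byIndex(String wanted) {
        return index.getOrDefault(wanted, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        TypeIndexSketch repo = new TypeIndexSketch();
        repo.declareType("my:bar", null);
        repo.declareType("my:foo", "my:bar");
        repo.declareType("my:gee", "my:bar");
        repo.declareType("my:doc", null); // hypothetical type for the "real children"

        // Many "real children" first, the few complex-datatype nodes last.
        for (int i = 0; i < 1000; i++) {
            repo.addNode("/doc/child" + i, "my:doc");
        }
        repo.addNode("/doc/title", "my:foo");
        repo.addNode("/doc/address", "my:gee");

        System.out.println(repo.byIndex("my:bar"));
        System.out.println(repo.byScan("my:bar").equals(repo.byIndex("my:bar")));
    }
}
```

Both strategies return the same nodes; the difference is that byScan visits all 1002 children to find the two matches, while byIndex only touches the two matches, which is the behaviour I am hoping the query engine provides.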