To: users@jackrabbit.apache.org
From: Christoph Kiehl
Subject: Re: Does jackrabbit suffer from n+1 SQL SELECT queries?
Date: Tue, 14 Aug 2007 22:33:52 +0200

Brian Thompson wrote:
> Would it be viable to add an option to retrieve everything in a given
> NodeIterator as a single DB query when a DB persistence manager is used? It
> seems to me that such an option would help out a lot for cases like
> retrieving search results, getting all children of a node, etc.

Sounds like an interesting addition. Maybe we could extend the persistence manager interface with some kind of bulk read method that takes a list of UUIDs as an argument. For DB PMs this would result in queries like "select ... from ... where uuid=uuid1 or uuid=uuid2 ..." or "select ... from ... where uuid in (uuid1, uuid2)". But those queries should still be faster than successive queries for each UUID.

The questions are:

1) How much faster is this in reality? It probably depends a lot on the kind of database you use and whether it is embedded, on the same machine, or remote.

2) In which situations would such a bulk read method be of any advantage? Since everything in Jackrabbit is built around loading nodes one by one, only node iterators come to mind right now. But there you have to decide how many nodes to load, and you never know how many nodes will actually be requested in the end. When querying, you can probably make quite a good guess if you are using setLimit(), but that's about it.
For general node iterators you will probably fetch the nodes in batches of 10 or so. Not sure if that really gains you a lot.

Just a few thoughts ;)

Cheers,
Christoph
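
To make the bulk read idea a bit more concrete, here is a minimal sketch of the two pieces discussed above: building one "where uuid in (...)" query for a list of ids, and splitting a longer id list into batches of 10 or so. This is not Jackrabbit code. The class BulkReadSketch, the methods buildBulkSelect and partition, and the table and column names are all made up for illustration; a real persistence manager has its own schema and interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BulkReadSketch {

    // Hypothetical table and column names, not taken from any Jackrabbit schema.
    static final String TABLE = "node_bundle";
    static final String ID_COLUMN = "uuid";

    /**
     * Builds a parameterized bulk query with one "?" placeholder per id,
     * e.g. "select * from node_bundle where uuid in (?, ?, ?)".
     */
    static String buildBulkSelect(int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("need at least one id");
        }
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "select * from " + TABLE + " where " + ID_COLUMN + " in (" + placeholders + ")";
    }

    /**
     * Splits the ids into consecutive batches of at most batchSize elements,
     * so an iterator could fetch one batch per database round trip.
     */
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }
}
```

With 23 ids and a batch size of 10, partition yields batches of 10, 10 and 3 ids, so three queries instead of 23. Whether that is noticeably faster still depends on the database and where it runs, as in question 1.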