From: "Sridhar Raman"
To: users@jackrabbit.apache.org
Date: Thu, 2 Oct 2008 16:11:30 +0530
Subject: Re: Contrasting performances of skip on node iterator

Ok, so is there a way in which we can get just the UUIDs of the child
nodes, skip the required number of entries in that iterator, and only then
retrieve the actual nodes? I am asking because I use the first kind of skip
to enable pagination, and it is very slow.

On Thu, Oct 2, 2008 at 3:56 PM, Alexander Klimetschek wrote:

> On Thu, Oct 2, 2008 at 12:17 PM, Sridhar Raman wrote:
> > I was testing a repository where a particular node has 10000 child nodes.
> > If I get an iterator over these child nodes and call skip on this
> > iterator, the performance is very slow (almost 3 seconds for 9000 nodes).
> > On the other hand, if I run an XPath query that returns the exact same
> > 10000 nodes and call skip on that iterator, the skip is almost
> > instantaneous.
> >
> > When I looked deeper, I noticed the iterator in the first case is a
> > LazyItemIterator, while in the second case, it is a
> > QueryResultImpl$LazyScoreNodeIterator.
> > Is that the only reason for the difference in performance?
>
> Off the top of my head:
>
> In the first case, the iterator looks at the real node data where it
> has to deserialize the node bundle (which contains the links to the
> child nodes) - there is simply no index involved here (that's the
> reason for the current limitation of not using too many child nodes).
>
> In the second case, the query index is used for the list of nodes,
> which is faster.
>
> Simply merging both solutions is not an easy option, since the query
> manager is optional - if you turn off the search index configuration,
> there won't be any index at all.
>
> Regards,
> Alex
>
> --
> Alexander Klimetschek
> alexander.klimetschek@day.com
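
To make the idea from the question concrete (skip over identifiers first, then load only the nodes for the requested page), here is a minimal sketch in plain Java. It is not Jackrabbit API: IdPager, page and the loader function are hypothetical names, and the ID list merely stands in for whatever cheap source of child identifiers is available, such as the query index described in the reply. It only illustrates why skipping is cheap when no node has to be loaded along the way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/**
 * Models "skip over IDs, then load only one page of nodes".
 * Nothing here is Jackrabbit API; all names are hypothetical.
 */
final class IdPager {

    /** Counts how often the (expensive) loader ran, to make the cost visible. */
    static int loads = 0;

    /**
     * Returns the items for one page. Skipping is plain index arithmetic on
     * the ID list, so the loader runs only for the IDs on the requested page.
     */
    static <T> List<T> page(List<String> ids, int offset, int pageSize,
                            Function<String, T> loader) {
        if (offset < 0 || pageSize < 0) {
            throw new IllegalArgumentException("offset and pageSize must be >= 0");
        }
        int from = Math.min(offset, ids.size());
        int to = (int) Math.min((long) from + pageSize, ids.size());
        List<T> result = new ArrayList<>(to - from);
        for (String id : ids.subList(from, to)) {
            result.add(loader.apply(id));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the 10000 child identifiers of the parent node.
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            ids.add("id-" + i);
        }
        // Stand-in for resolving an ID to a real node (assumed to be the costly step).
        Function<String, String> loader = id -> {
            loads++;
            return "node(" + id + ")";
        };
        List<String> page = page(ids, 9000, 10, loader);
        System.out.println(page.get(0) + " .. " + page.get(9) + ", loads=" + loads);
    }
}
```

With an offset of 9000 and a page size of 10, the loader in this model runs 10 times, not 9010 times. Whether a real repository can hand out child identifiers this cheaply depends on its configuration, in particular on whether the search index is enabled.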