Mailing-List: contact users-help@jackrabbit.apache.org; run by ezmlm
Precedence: bulk
Reply-To: users@jackrabbit.apache.org
Message-ID: <227621ad0810020341ye5b971bh4f141375c6a4c352@mail.gmail.com>
Date: Thu, 2 Oct 2008 16:11:30 +0530
From: "Sridhar Raman" <sridhar.raman@gmail.com>
To: users@jackrabbit.apache.org
Subject: Re: Contrasting performances of skip on node iterator
In-Reply-To: <c3ac3bad0810020326q17774c44i82a5ee7f32e6f15d@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_Part_30030_10992779.1222944090747"
References: <227621ad0810020317x32b8340clfead1560072e5b95@mail.gmail.com>
	 <c3ac3bad0810020326q17774c44i82a5ee7f32e6f15d@mail.gmail.com>

------=_Part_30030_10992779.1222944090747
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

OK, so is there a way to get just the UUIDs of the child nodes, skip the
required number of entries in that iterator, and only then retrieve the
actual nodes?  I am asking because I use the first kind of skip (on the
child node iterator) for pagination, and it is very slow.
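
To make the idea concrete, here is a minimal sketch in plain Java. It is a toy model and does not use any Jackrabbit or JCR classes: SkipSketch, CountingLoader, pageByLoading and pageByIds are hypothetical names, and the loader only stands in for whatever it costs to read a real node. It also assumes the list of child identifiers can be obtained cheaply, which is exactly the open question here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Toy model only: none of these types are Jackrabbit classes. */
final class SkipSketch {

    /** Stand-in for an expensive node load; it just counts its calls. */
    static final class CountingLoader implements Function<String, String> {
        int loads = 0;

        @Override
        public String apply(String id) {
            loads++;                      // count every simulated load
            return "node-" + id;
        }
    }

    /** Skips by loading each skipped item, then loads the page. */
    static List<String> pageByLoading(List<String> ids, int offset, int size,
                                      Function<String, String> loader) {
        int from = Math.max(0, Math.min(offset, ids.size()));
        int to = Math.min(from + Math.max(0, size), ids.size());
        for (int i = 0; i < from; i++) {
            loader.apply(ids.get(i));     // loaded only to be thrown away
        }
        List<String> page = new ArrayList<>();
        for (int i = from; i < to; i++) {
            page.add(loader.apply(ids.get(i)));
        }
        return page;
    }

    /** Skips over identifiers only, then loads just the requested page. */
    static List<String> pageByIds(List<String> ids, int offset, int size,
                                  Function<String, String> loader) {
        int from = Math.max(0, Math.min(offset, ids.size()));
        int to = Math.min(from + Math.max(0, size), ids.size());
        List<String> page = new ArrayList<>();
        for (String id : ids.subList(from, to)) {
            page.add(loader.apply(id));
        }
        return page;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            ids.add(Integer.toString(i));
        }
        CountingLoader slow = new CountingLoader();
        CountingLoader fast = new CountingLoader();
        pageByLoading(ids, 9000, 20, slow);
        pageByIds(ids, 9000, 20, fast);
        System.out.println("loads when skipping by loading: " + slow.loads);
        System.out.println("loads when skipping over ids:   " + fast.loads);
    }
}
```

With 10000 identifiers, an offset of 9000 and a page size of 20, the first variant performs 9020 simulated loads and the second only 20. Whether the real child node iterator can be driven this way is not something this sketch shows.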

On Thu, Oct 2, 2008 at 3:56 PM, Alexander Klimetschek <aklimets@day.com> wrote:

> On Thu, Oct 2, 2008 at 12:17 PM, Sridhar Raman <sridhar.raman@gmail.com> wrote:
> > I was testing a repository where a particular node has 10000 child nodes.
> > If I get an iterator over these child nodes and call skip on it, the
> > performance is very slow (almost 3 seconds for 9000 nodes).  On the
> > other hand, if I run an XPath query that returns the exact same 10000
> > nodes and call skip on that iterator, the skip is almost instantaneous.
> >
> > When I looked deeper, I noticed the iterator in the first case is a
> > LazyItemIterator, while in the second case, it is a
> > QueryResultImpl$LazyScoreNodeIterator.  Is that the only reason for the
> > difference in performance?
>
> Off the top of my head:
>
> In the first case, the iterator looks at the real node data, where it
> has to deserialize the node bundle (which contains the links to the
> child nodes). There is simply no index involved here (that's the
> reason for the current limitation on having too many child nodes).
>
> In the second case, the query index is used for the list of nodes,
> which is faster.
>
> Simply merging both solutions is not an easy option, since the query
> manager is optional - if you turn off the search index configuration,
> there won't be any index at all.
>
> Regards,
> Alex
>
> --
> Alexander Klimetschek
> alexander.klimetschek@day.com
>

------=_Part_30030_10992779.1222944090747--