jackrabbit-users mailing list archives

From "Sridhar Raman" <sridhar.ra...@gmail.com>
Subject Re: Contrasting performances of skip on node iterator
Date Thu, 02 Oct 2008 11:52:17 GMT
Here's another curious thing that I noticed.  In the first case, I was using
node.getNodes() to get the iterator.  When I tried node.getNodes("*") instead,
the skip method became almost instantaneous.  Is this expected?
Should I go with this approach?

On Thu, Oct 2, 2008 at 4:11 PM, Sridhar Raman <sridhar.raman@gmail.com> wrote:

> Ok, so is there a way in which we can possibly just get the UUIDs of the
> child nodes, skip the required amount in this iterator, and then
> retrieve the actual nodes?  I am asking because I use the first skip to
> enable pagination, and it's very slow.
>
> On Thu, Oct 2, 2008 at 3:56 PM, Alexander Klimetschek <aklimets@day.com> wrote:
>> On Thu, Oct 2, 2008 at 12:17 PM, Sridhar Raman <sridhar.raman@gmail.com> wrote:
>> > I was testing a repository where a particular node has 10000 child nodes.
>> > If I get an iterator over these child nodes and call skip on it, the
>> > performance is very slow (almost 3 seconds for 9000 nodes).  On the other
>> > hand, if I run an XPath query that returns the exact same 10000 nodes and
>> > call skip on that iterator, the skip is almost instantaneous.
>> >
>> > When I looked deeper, I noticed the iterator in the first case is a
>> > LazyItemIterator, while in the second case, it is a
>> > QueryResultImpl$LazyScoreNodeIterator.  Is that the only reason for the
>> > difference in performance?
>>
>> Off the top of my head:
>> In the first case, the iterator looks at the real node data, where it
>> has to deserialize the node bundle (which contains the links to the
>> child nodes) - there is simply no index involved here (that's the
>> reason for the current limitation of not using too many child nodes).
>> In the second case, the query index is used for the list of nodes,
>> which is faster.
>> Simply merging both solutions is not an easy option, since the query
>> manager is optional - if you turn off the search index configuration,
>> there won't be any index at all.
>> Regards,
>> Alex
>> --
>> Alexander Klimetschek
>> alexander.klimetschek@day.com
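
The pagination idea raised in the thread (skip over child identifiers first, then load only the nodes on the requested page) can be sketched roughly as below. This is a minimal, self-contained illustration and not Jackrabbit code: the class ChildPager, the method page, and the loader function are all hypothetical names, and the in-memory list of ids stands in for whatever cheap source of child identifiers is available.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of the JCR or Jackrabbit API.
final class ChildPager {

    // Returns up to `limit` loaded items starting at position `offset`.
    // Identifiers before the offset are consumed but never passed to the
    // loader, so the expensive load happens only for the requested page.
    static <I, N> List<N> page(Iterator<I> ids, long offset, int limit,
                               Function<I, N> loader) {
        // Skip over identifiers only; nothing is loaded here.
        for (long i = 0; i < offset && ids.hasNext(); i++) {
            ids.next();
        }
        // Load just the items that belong to the page.
        List<N> result = new ArrayList<>();
        while (result.size() < limit && ids.hasNext()) {
            result.add(loader.apply(ids.next()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed stand-in for the identifiers of 10000 child nodes.
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            ids.add("id-" + i);
        }
        int[] loads = {0}; // counts how many times the loader is invoked
        List<String> page = page(ids.iterator(), 9000, 10, id -> {
            loads[0]++;
            return "node:" + id;
        });
        System.out.println(page.get(0) + " loads=" + loads[0]);
    }
}
```

Whether this helps in practice depends on having a source of identifiers that is cheaper to iterate than the nodes themselves, which is exactly the point in question in the replies above.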
