jackrabbit-oak-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: The infamous getSize() == -1 (Was: [jira] [Created] (OAK-300) Query: QueryResult.getRows().getSize())
Date Thu, 13 Sep 2012 08:22:20 GMT

>return the correct size if the result set has fewer than
>something like 1000 entries. That should cover most practical cases

Yes. Let's discuss the value now! 1000 sounds OK in general, but there
is a potential performance problem. With Jackrabbit 2.x, if there are more
than a few million nodes in the repository, you can only load about 100
nodes per second (when using a regular hard disk). If we always try to
prefetch 1000 nodes (which is what you have to do in this case), then
getSize() could take 10 seconds. I think that's not acceptable.
Therefore I argue the value should be about 20, which means that in the
worst case getSize() would take 0.2 seconds. Of course we can still change
the value later, but I would start with 20 (or even lower) just so we
detect problems with that early on.
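
To make the trade-off concrete, here is a rough sketch of a bounded size
check together with the worst-case arithmetic. It is not the Oak or
Jackrabbit API: the class name, boundedSize(), worstCaseSeconds() and
DEFAULT_PREFETCH_LIMIT are made up for illustration, and the 100 nodes
per second figure is just the estimate from above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch only; none of these names exist in Oak or Jackrabbit.
final class BoundedSizeSketch {

    // Hypothetical default, using the value of 20 proposed in this thread.
    static final int DEFAULT_PREFETCH_LIMIT = 20;

    /**
     * Returns the exact number of rows if there are at most {@code limit},
     * otherwise -1 ("size unknown"). Reads at most limit + 1 rows.
     */
    static long boundedSize(Iterator<?> rows, int limit) {
        long count = 0;
        while (rows.hasNext()) {
            rows.next();
            count++;
            if (count > limit) {
                return -1; // more rows than we are willing to prefetch
            }
        }
        return count;
    }

    /** Worst-case time spent prefetching, for a given load rate. */
    static double worstCaseSeconds(int limit, double nodesPerSecond) {
        return limit / nodesPerSecond;
    }

    public static void main(String[] args) {
        List<String> small = Arrays.asList("a", "b", "c");
        System.out.println(boundedSize(small.iterator(), DEFAULT_PREFETCH_LIMIT));
        System.out.println(boundedSize(small.iterator(), 2));
        System.out.println(worstCaseSeconds(1000, 100.0)); // limit of 1000
        System.out.println(worstCaseSeconds(20, 100.0));   // limit of 20
    }
}
```

Note that this sketch consumes the iterator while counting. A real
implementation would have to buffer the prefetched rows so they can still
be returned to the caller afterwards.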

>> If getSize(int max) is just in the Oak API but not
>> available for the end user, then we didn't really gain much :-)
>If we come up with a better API design in oak-core, that'll be a good
>candidate for inclusion in JCR 2.1.

Yes, that makes sense of course.

