jackrabbit-users mailing list archives

From Christoph Kiehl <christ...@sulu3000.de>
Subject Re: Does jackrabbit suffer from n+1 SQL SELECT queries?
Date Tue, 14 Aug 2007 20:33:52 GMT
Brian Thompson wrote:

> Would it be viable to add an option to retrieve everything in a given
> NodeIterator as a single DB query when a DB persistence manager is used?  It
> seems to me that such an option would help out a lot for cases like
> retrieving search results, getting all children of a node, etc.

Sounds like an interesting addition. Maybe we could extend the persistence 
manager interface with some kind of bulk read method which takes a list of uuids 
as an argument. For DB PMs this will result in queries like "select ... from ... 
where uuid=uuid1 or uuid=uuid2 ..." or "select ... from ... where uuid in 
(uuid1, uuid2)". Either way, those queries should still be faster than successive 
queries for each uuid.
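
To make the idea concrete, here is a minimal sketch of how a DB PM might build the "in" variant of such a query with one placeholder per uuid. The class, the method, and the table and column names (NODE, UUID) are hypothetical and only illustrate the shape of the statement; none of this is existing Jackrabbit API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BulkReadSketch {

    /**
     * Builds a parameterized bulk select with one "?" placeholder per id.
     * The table and column names are supplied by the caller; the ones used
     * in main() are made up for illustration.
     */
    public static String buildBulkSelect(String table, String idColumn, int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("need at least one id");
        }
        // Produces "?, ?, ?" for idCount == 3.
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "SELECT * FROM " + table + " WHERE " + idColumn + " IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        List<String> uuids = Arrays.asList("uuid1", "uuid2", "uuid3");
        // A real PM would bind each uuid to a PreparedStatement parameter
        // instead of printing the statement text.
        System.out.println(buildBulkSelect("NODE", "UUID", uuids.size()));
    }
}
```

Using placeholders rather than concatenating the uuids keeps the statement safe to prepare, although a differently sized list yields a different statement text, so statement caching would only help if the batch size is kept fixed.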
The questions are:

1) How much faster is this in reality? It probably depends a lot on the kind of 
database you use, and on whether it is embedded, on the same machine, or remote.
2) In which situations can such a bulk read method be of any advantage? Since 
everything in Jackrabbit is built around loading nodes one by one, only node 
iterators come to my mind right now. But there you have to decide how many nodes 
to load, because you never know how many will actually be requested in the 
end. When querying you could probably make quite a good guess if you are using 
setLimit(), but that's about it. For general node iterators you would probably 
fetch the nodes in batches of 10 or so. Not sure if that really gains you a lot.
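
The batching idea from point 2 could look roughly like the sketch below: an iterator that pulls ids lazily and hands them to a bulk loader in fixed-size batches, so nothing is loaded beyond the batch currently being consumed. BatchingIterator and its bulkLoader function are hypothetical names standing in for whatever bulk read method a persistence manager might offer; they are not part of Jackrabbit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

public class BatchingIterator<K, V> implements Iterator<V> {
    private final Iterator<K> ids;
    private final Function<List<K>, List<V>> bulkLoader; // hypothetical bulk read
    private final int batchSize;
    private final Deque<V> buffer = new ArrayDeque<>();  // loaded, not yet returned

    public BatchingIterator(Iterator<K> ids,
                            Function<List<K>, List<V>> bulkLoader,
                            int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.ids = ids;
        this.bulkLoader = bulkLoader;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        // Load the next batch only when the buffer has run dry. The loop
        // covers the case where the loader returns fewer items than ids
        // (for example because some nodes no longer exist).
        while (buffer.isEmpty() && ids.hasNext()) {
            List<K> batch = new ArrayList<>(batchSize);
            while (batch.size() < batchSize && ids.hasNext()) {
                batch.add(ids.next());
            }
            // The loader must not return null elements (ArrayDeque rejects them).
            buffer.addAll(bulkLoader.apply(batch));
        }
        return !buffer.isEmpty();
    }

    @Override
    public V next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.removeFirst();
    }

    public static void main(String[] args) {
        Iterator<String> ids = Arrays.asList("a", "b", "c", "d", "e").iterator();
        // Stand-in loader: reports each batch and returns the ids unchanged.
        BatchingIterator<String, String> it = new BatchingIterator<>(ids, batch -> {
            System.out.println("loading batch " + batch);
            return new ArrayList<>(batch);
        }, 2);
        while (it.hasNext()) {
            System.out.println("got " + it.next());
        }
    }
}
```

With a batch size of 10, a caller that reads only the first node still pays for loading ten, which is the trade-off raised above: the batch size has to be guessed without knowing how far the caller will iterate.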

Just a few thoughts ;)

