jackrabbit-users mailing list archives

From "Seidel. Robert" <Robert.Sei...@aeb.de>
Subject SQL2 can't be used for large amounts of hits
Date Wed, 04 May 2011 15:37:34 GMT

I've done some testing with Jackrabbit 2.2.1, SQL2, nearly 100k nodes and an Oracle bundled
data store. A query returning all 100k nodes took about 10 minutes to execute (just to execute,
not to iterate over the result).
At first I thought it was because of the sort from the "order by" expression I had used, but I
removed the order by and the query is still slow.

I've looked a little bit into the code and found LuceneQueryFactory, public List<Row>
execute(...). It contains something like:

ScoreNode node = hits.nextScoreNode();
while (node != null) {
  ... session.getNodeById(node.getNodeId()) ...
  node = hits.nextScoreNode();
}

The funny thing is, although everything is read into a collection, you can't ask the query
how many hits there were, because the collection is wrapped in an iterator afterwards...

For the GUI this is OK for me, because I can only display a fixed number of hits, like 100
or so, so I can use setLimit() (although if I do, the sort is broken, because it sorts only
the limited results...).

But I have another use case, where I want to iterate over a huge number of nodes and export
the results. In this case, the collection from LuceneQueryFactory will simply not fit into
memory.

Is there any solution for this, like a real iterator?
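
To illustrate what a "real iterator" could mean here, the following is a minimal sketch and
not Jackrabbit code. It assumes a hit source that returns null when exhausted (standing in
for hits.nextScoreNode()) and a loader function (standing in for session.getNodeById(...)).
The class name LazyRowIterator and both parameters are hypothetical. Each row is loaded only
when next() is called, so at most one row is held at a time.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch: resolves one hit at a time instead of collecting all rows first.
// nextHit stands in for hits.nextScoreNode() and must return null when exhausted;
// loader stands in for session.getNodeById(...). Neither is actual Jackrabbit API.
final class LazyRowIterator<H, R> implements Iterator<R> {
    private final Supplier<H> nextHit;
    private final Function<H, R> loader;
    private H pending;        // hit fetched by hasNext() but not yet handed out
    private boolean primed;   // true once pending holds the look-ahead result

    LazyRowIterator(Supplier<H> nextHit, Function<H, R> loader) {
        this.nextHit = nextHit;
        this.loader = loader;
    }

    @Override
    public boolean hasNext() {
        if (!primed) {
            pending = nextHit.get();  // pull exactly one hit ahead
            primed = true;
        }
        return pending != null;
    }

    @Override
    public R next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        H hit = pending;
        pending = null;
        primed = false;
        return loader.apply(hit);     // the expensive load happens only here
    }
}
```

With something like this, an export loop would pull and write one row at a time. The trade-off
is that the total number of hits is not known up front, and any sorting would have to be done
by the underlying hit source rather than on a fully loaded collection.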

Kind regards, Robert

