jackrabbit-oak-dev mailing list archives

From Chetan Mehrotra <chetan.mehro...@gmail.com>
Subject Re: Slow full text query performance and Lucene Index handling in Oak
Date Wed, 09 Apr 2014 11:24:32 GMT
On Wed, Apr 9, 2014 at 3:00 PM, Alex Parvulescu
<alex.parvulescu@gmail.com> wrote:
>  - the patch assumes that there is and will be a single lucene index
> directly under the root node, which may not necessarily be the case. I
> agree this assumption holds now, but I would not introduce any changes that
> take away this flexibility.

That is not a problem per se, as an IndexReader starts with a reference
count of 1, so the count would never drop to zero.

The problem appears to be somewhere else: I modified the code to use a
shared IndexSearcher and a native FSDirectory, and the performance
improvement was still marginal.

The problem occurs because
org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex#query [1]
currently does an eager initialization of the cursor, while the testcase
only fetches the first result. By comparison, the JR2 version does a
lazy evaluation. If I put a break in the loop (exit after the first
result), the results are much better:

Oak-Tar(break.shared searcher,fs)  1       2       2       3       3     170   23204
Oak-Tar(break)                     1       5       5       5       6      90   10593
Jackrabbit                         1       4       4       5       6     231   11385

Now I am not sure whether this is a problem with the use case chosen, or
whether the Lucene index cursor management should be improved, since in
many cases there would be multiple results but the client code only
makes use of the first few.
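
To illustrate the eager versus lazy difference described above, here is a
minimal sketch in plain Java. It does not use the Oak or Lucene APIs; the
class CursorSketch, its methods, and the "/content/node" paths are all
hypothetical names invented for this illustration. A counting iterator
stands in for the index, so we can see how many hits each kind of cursor
loads when the caller reads only the first result.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: not the Oak LuceneIndex implementation.
class CursorSketch {

    /** Stand-in for an index: yields 'total' paths and counts each one loaded. */
    static Iterator<String> hits(final int total, final AtomicInteger loaded) {
        return new Iterator<String>() {
            private int position = 0;

            @Override
            public boolean hasNext() {
                return position < total;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                loaded.incrementAndGet();
                return "/content/node" + position++;
            }
        };
    }

    /** Eager cursor: drains the whole source into a list before returning. */
    static Iterator<String> eagerCursor(Iterator<String> source) {
        List<String> paths = new ArrayList<>();
        while (source.hasNext()) {
            paths.add(source.next());
        }
        return paths.iterator();
    }

    /** Lazy cursor: pulls one hit from the source each time next() is called. */
    static Iterator<String> lazyCursor(final Iterator<String> source) {
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public String next() {
                return source.next();
            }
        };
    }

    /** Reads only the first result and reports how many hits were loaded. */
    static int loadsForFirstResult(boolean lazy, int total) {
        AtomicInteger loaded = new AtomicInteger();
        Iterator<String> source = hits(total, loaded);
        Iterator<String> cursor = lazy ? lazyCursor(source) : eagerCursor(source);
        cursor.next();
        return loaded.get();
    }

    public static void main(String[] args) {
        System.out.println("eager loads: " + loadsForFirstResult(false, 1000));
        System.out.println("lazy loads:  " + loadsForFirstResult(true, 1000));
    }
}
```

With 1000 hits available, the eager cursor loads all 1000 before the
caller sees the first one, while the lazy cursor loads exactly one. A
real cursor would probably fetch in batches rather than one hit at a
time, but that is a design choice this sketch does not cover.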

Chetan Mehrotra
[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java#L381-L409