lucene-java-user mailing list archives

From Sriram Sankar <san...@gmail.com>
Subject Performance measurements
Date Wed, 24 Jul 2013 16:11:07 GMT
I did some performance tests on a real index using a query having the
following pattern:

termA AND (termB1 OR termB2 OR ... OR termBn)
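The semantics of the query pattern above can be sketched over plain sorted doc-id lists. This is only an illustration of what the query matches, not how Lucene evaluates it; the function name `and_of_ors` and its parameters are hypothetical, and treating the n=0 case as "termA alone" is an assumption.

```python
from typing import List, Tuple


def and_of_ors(term_a: List[int], term_bs: List[List[int]], k: int) -> Tuple[List[int], int]:
    """Match termA AND (termB1 OR ... OR termBn) over sorted doc-id lists.

    Returns the first k matching doc ids and the total number of matches.
    """
    if not term_bs:
        # Assumption: with no termB clauses the query reduces to termA alone.
        matches = list(term_a)
    else:
        # OR part: union of all termB posting lists.
        b_union = set()
        for postings in term_bs:
            b_union.update(postings)
        # AND part: keep termA's docs that appear in the union, in doc-id order.
        matches = [doc for doc in term_a if doc in b_union]
    return matches[:k], len(matches)


top, total = and_of_ors([1, 3, 5, 7, 9], [[3, 4], [7, 8, 9]], k=2)
```

Here `top` holds the first k matches and `total` is the full match count, which corresponds to the "scored" figure in the measurements.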

The results were not good, and I am wondering whether I am doing something
wrong (and, if so, what I need to change to improve performance), or whether
OR is simply very inefficient.

The format of the data below is illustrated by this example:

5|10
time: 0.092728962; scored: 18

Here, n=5, and we measure the time to retrieve 10 results, which is
0.0927ms. Had we not terminated early, we would have obtained 18 results.
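For anyone who wants to work with these numbers, the record format described above could be parsed along these lines. This is a minimal sketch; `Measurement` and `parse_measurement` are hypothetical names, and the field meanings in the comments follow the explanation in this message.

```python
import re
from typing import NamedTuple


class Measurement(NamedTuple):
    n: int        # number of OR'ed termB clauses
    k: int        # number of results requested
    time: float   # reported time, in the units used in this message
    scored: int   # hits that would be found without early termination


HEADER = re.compile(r"^(\d+)\|(\d+)$")
BODY = re.compile(r"^time:\s*([0-9.]+);\s*scored:\s*(\d+)$")


def parse_measurement(header: str, body: str) -> Measurement:
    """Parse one two-line record such as '5|10' / 'time: ...; scored: ...'."""
    h = HEADER.match(header.strip())
    b = BODY.match(body.strip())
    if h is None or b is None:
        raise ValueError(f"unrecognized record: {header!r} / {body!r}")
    return Measurement(int(h.group(1)), int(h.group(2)),
                       float(b.group(1)), int(b.group(2)))


m = parse_measurement("5|10", "time: 0.092728962; scored: 18")
```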

As you will see in the data below, performance for n=0 is very good,
but it degrades drastically as n increases.

Sriram.


0|10
time: 0.007941587; scored: 10887

0|1000
time: 0.018967384; scored: 10887

0|5000
time: 0.061943552; scored: 10887

0|10000
time: 0.115327001; scored: 10887

1|10
time: 0.053950965; scored: 0

5|20
time: 0.274681853; scored: 18

10|10
time: 0.14251254; scored: 22

10|20
time: 0.282503313; scored: 22

20|10
time: 0.251964067; scored: 32

20|30
time: 0.52860957; scored: 32

50|10
time: 0.888969702; scored: 57

50|30
time: 1.078579956; scored: 57

50|50
time: 1.601169195; scored: 57

100|10
time: 1.396391061; scored: 79

100|40
time: 1.8083494; scored: 79

100|80
time: 2.921094513; scored: 79

200|10
time: 2.848105701; scored: 119

200|50
time: 3.472198462; scored: 119

200|100
time: 4.722673648; scored: 119

400|10
time: 4.463727049; scored: 235

400|100
time: 6.554119665; scored: 235

400|200
time: 9.591892527; scored: 235
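One way to read the k=10 rows above is to divide each time by n. The sketch below does only that arithmetic on the figures copied from this message (rows with n >= 10); the variable names are hypothetical, and the result is an observation about these numbers, not a diagnosis of the cause.

```python
# (n, time) pairs copied from the k=10 rows of the data above.
rows_k10 = [
    (10, 0.14251254),
    (20, 0.251964067),
    (50, 0.888969702),
    (100, 1.396391061),
    (200, 2.848105701),
    (400, 4.463727049),
]

# Time divided by the number of OR'ed clauses.
time_per_clause = {n: t / n for n, t in rows_k10}

for n, per_clause in time_per_clause.items():
    print(f"n={n:>3}  time/n={per_clause:.5f}")
```

For these rows the per-clause figure stays between roughly 0.011 and 0.018, which is consistent with the time growing about linearly in n.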
