lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "hui" <...@triplehop.com>
Subject RE: Sys properties Was: java.io.tmpdir as lock dir .... once again
Date Tue, 09 Mar 2004 15:51:45 GMT
Hi,
I did the testing on searching/indexing larger documents for the two index
formats. The indexing application indices the 66k documents incrementally up
to 10 times. At the same time, the search program wakes every 20 minutes to
open the index and do the search.

For the indexing, when indexing on the 9th 66k document, the compound index
case need 500 seconds more but it could be the machine is running other
schedule jobs. (I run the non-compound index case immediately after compound
index case; the first 66k for non-compound index also needs 300-400 seconds
more). So the index speed difference is within 10%.
There is no too much difference on the search itself. But for the compound
index case, the opening index reaches around 1 second twice. I saw this
happened for non-compound index format before, but it looks like compound
index format has higher frequency on this issue.


Regards,
Hui

Indexing time:
non-compound index
2:1709393 total milliseconds index total docs:66100
3:1322806 total milliseconds index total docs:66100
4:1387652 total milliseconds index total docs:66100
5:1371401 total milliseconds index total docs:66100
6:1378230 total milliseconds index total docs:66100
7:1375276 total milliseconds index total docs:66100
8:1411543 total milliseconds index total docs:66100
9:1533842 total milliseconds index total docs:66100
10:1722148 total milliseconds index total docs:66100
11:1893729 total milliseconds index total docs:66100

compound index
2:1437950 total milliseconds index total docs:66100
3:1406980 total milliseconds index total docs:66100
4:1427058 total milliseconds index total docs:66100
5:1462169 total milliseconds index total docs:66100
6:1461356 total milliseconds index total docs:66100
7:1657923 total milliseconds index total docs:66100
8:1901755 total milliseconds index total docs:66100
9:2041649 total milliseconds index total docs:66100


search time:

non-compound index

0 open index time:235
5551 total within(ms) 187
7326 total within(ms) 31
8698 total within(ms) 47
4179 total within(ms) 16
198 total within(ms) 47
1 open index time:63
8242 total within(ms) 15
10918 total within(ms) 0
13022 total within(ms) 16
6138 total within(ms) 15
332 total within(ms) 15
2 open index time:47
12194 total within(ms) 15
16210 total within(ms) 0
19305 total within(ms) 125
9099 total within(ms) 15
485 total within(ms) 31
3 open index time:47
16461 total within(ms) 16
21924 total within(ms) 0
26107 total within(ms) 31
12278 total within(ms) 15
655 total within(ms) 16
4 open index time:47
19857 total within(ms) 16
26560 total within(ms) 15
31583 total within(ms) 31
14834 total within(ms) 15
784 total within(ms) 15
5 open index time:78
23357 total within(ms) 16
31234 total within(ms) 16
37169 total within(ms) 31
17422 total within(ms) 32
895 total within(ms) 47
6 open index time:47
27208 total within(ms) 78
36371 total within(ms) 31
43276 total within(ms) 32
20303 total within(ms) 31
1054 total within(ms) 31
7 open index time:31
30073 total within(ms) 16
40270 total within(ms) 15
47977 total within(ms) 31
22366 total within(ms) 32
1202 total within(ms) 31
8 open index time:63
34133 total within(ms) 31
45709 total within(ms) 32
54424 total within(ms) 31
25418 total within(ms) 31
1354 total within(ms) 62
9 open index time:62
37845 total within(ms) 47
50582 total within(ms) 16
60236 total within(ms) 47
28191 total within(ms) 31
1505 total within(ms) 47
10 open index time:46
41416 total within(ms) 32
55456 total within(ms) 31
65980 total within(ms) 109
30892 total within(ms) 47
1641 total within(ms) 47
11 open index time:31
44121 total within(ms) 78
58992 total within(ms) 31
70242 total within(ms) 62
32871 total within(ms) 31
1729 total within(ms) 47
12 open index time:47
46718 total within(ms) 31
62473 total within(ms) 15
74415 total within(ms) 47
34776 total within(ms) 31
1862 total within(ms) 47
13 open index time:0
47025 total within(ms) 32
63008 total within(ms) 31
75031 total within(ms) 47
35002 total within(ms) 31
1870 total within(ms) 31
14 open index time:0
47025 total within(ms) 31
63008 total within(ms) 31
75031 total within(ms) 47
35002 total within(ms) 31
1870 total within(ms) 32

compound index

0 open index time:203
7876 total within(ms) 125
10326 total within(ms) 0
12317 total within(ms) 32
5885 total within(ms) 16
315 total within(ms) 31
1 open index time:62
10562 total within(ms) 16
14127 total within(ms) 0
16793 total within(ms) 16
7896 total within(ms) 15
385 total within(ms) 15
2 open index time:1016
14466 total within(ms) 15
19304 total within(ms) 0
22944 total within(ms) 31
10826 total within(ms) 15
546 total within(ms) 16
3 open index time:15
17123 total within(ms) 16
22947 total within(ms) 15
27326 total within(ms) 15
12744 total within(ms) 125
680 total within(ms) 31
4 open index time:63
21148 total within(ms) 15
28245 total within(ms) 15
33659 total within(ms) 31
15734 total within(ms) 31
842 total within(ms) 63
5 open index time:47
25019 total within(ms) 16
33398 total within(ms) 16
39772 total within(ms) 32
18645 total within(ms) 32
995 total within(ms) 47
6 open index time:47
28367 total within(ms) 15
37970 total within(ms) 31
45169 total within(ms) 46
21168 total within(ms) 31
1112 total within(ms) 31
7 open index time:31
31424 total within(ms) 31
42002 total within(ms) 16
49994 total within(ms) 31
23432 total within(ms) 32
1223 total within(ms) 47
8 open index time:46
33895 total within(ms) 32
45292 total within(ms) 47
53957 total within(ms) 47
25230 total within(ms) 32
1352 total within(ms) 47
9 open index time:63
37320 total within(ms) 31
49922 total within(ms) 15
59412 total within(ms) 47
27830 total within(ms) 31
1474 total within(ms) 62
10 open index time:984
38475 total within(ms) 16
51552 total within(ms) 16
61389 total within(ms) 47
28638 total within(ms) 46
1530 total within(ms) 157

-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org] 
Sent: Monday, March 08, 2004 1:16 PM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

hui wrote:
> Index time: 
> compound format is 89 seconds slower.
> 
> compound format:
> 1389507 total milliseconds
> non-compound format:
> 1300534 total milliseconds
> 
> The index size is 85m with 4 fields only. The files are stored in the
index.
> The compound format has only 3 files and the other has 13 files. 

Thanks for performing this benchmark!

It looks like the compound format is around 7% slower when indexing.  To 
my thinking that's acceptable, given the dramatic reduction in file 
handles.  If folks really need maximal indexing performance, then they 
can explicitly disable the compound format.

Would anyone object to making compound format the default for Lucene 
1.4?  This is an incompatible change, but I don't think it should break 
applications.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message