lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From mopster <mops...@gmail.com>
Subject Speed of complex boolean searches on large indexes
Date Fri, 09 Sep 2005 15:05:45 GMT
Hi, 

I am testing the speed of searching Lucene indexes. The index is of
the larger size! It has about 500,000 documents, about 60 fields with
1 field (Field1) containing  the body of the document. Total index
size is currently about 20Gb

Testing the search i get this behaviour

(Field2:1) AND (Field10:50000) AND ( Field10010:null OR Field10010:14
OR Field10010:G2 NOT Field10009:14  AND (Field10000:0 OR Field10000:1
OR Field10000:2 OR Field10000:3) AND ((Field10005:null AND
Field10006:null AND Field10007:null   Field10008:null ) OR
(Field10005:2 OR Field10006:2 OR  Field10007:14 OR Field10008:14)))

took 13 secs (don't worry about the high field values. Started at
10,000. Null is just  a search tag entered if nothing is in the field)

so took out the NOT

(Field2:1) AND (Field10:50000) AND ( Field10010:1 OR Field10010:14 OR
Field10010:G2 AND Field10009:14  AND (Field10000:0 OR Field10000:1 OR
Field10000:2 OR Field10000:3) AND ((Field10005:1 AND Field10006:1 AND
Field10007:1   Field10008:1 ) OR (Field10005:2 OR Field10006:2 OR 
Field10007:14 OR Field10008:14)))

took 9 secs

so took out the OR

(Field2:1) AND (Field10:50000) AND ( Field10010:1 AND Field10010:14
AND Field10010:G2 AND Field10009:14  AND (Field10000:0 AND
Field10000:1 AND Field10000:2 AND Field10000:3) AND ((Field10005:1 AND
Field10006:1 AND Field10007:1   Field10008:1 ) AND (Field10005:2 AND
Field10006:2 AND  Field10007:14 AND Field10008:14)))

took 4 secs 

so took out the extra ()

Field2:1 AND Field10:50000 AND  Field10010:1 AND Field10010:14 AND
Field10010:G2 AND Field10009:14  AND Field10000:0 AND Field10000:1 AND
Field10000:2 AND Field10000:3 AND Field10005:1 AND Field10006:1 AND
Field10007:1   Field10008:1  AND Field10005:2 AND Field10006:2 AND 
Field10007:14 AND Field10008:14

took 1 second

Has anyone got any thoughts on this? Do i need to the search
differently? Should I not have indexes this large. Maybe smaller ones
and combine the results?

Has anyone else had this type of issue?

Regards,

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message