lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jie Yang <jyang_w...@yahoo.co.uk>
Subject Poor Performance when searching for 500+ terms
Date Wed, 12 Nov 2003 21:11:36 GMT
I know this is rare, But I am building an application
that submits searches having 500+ search terms. A
general example would be 

field1:w1 OR field1:w2 OR ... OR field1:w500

For 1 millions documents, the performance is OK if
field1 in each document has less than 50 terms, I can
get result < 1 sec. but if field1 has more than
average 400 terms in each document, the performance
degrades to around 6 secs.

Is there anyway to improve this? 

And my second questions is that my query often comes
with an AND condition with another search word. for
example:

field2:w AND (field1:w1 OR field1:w2, ... field1:w500)

field2:w will only return less than 1000 records out
of 1 millions. then I thought I could use a
StringFilter Object? i.e. search on field2.w first,
thus limit the search for 500 OR only on the field2.w
1000 results. somewhat like a join in database. But I
checked the code and sees that IndexSearcher always
perfomance the 500 disk searches before calling the
filter object? Any suggestions on this?

Also does lucene caches results in memory? I see the
performance tends to get better after a few runs,
especailly on searches on fields having small number
of terms. If so, can I manipulate the cache size
somehow to accommdate fields with large number of
terms. 

Many thanks.


________________________________________________________________________
Want to chat instantly with your online friends?  Get the FREE Yahoo!
Messenger http://mail.messenger.yahoo.co.uk

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message