lucene-java-user mailing list archives

From moshe <mo...@egis-software.com>
Subject Performance Questions
Date Sat, 08 Sep 2007 10:41:07 GMT

I have a few questions about Lucene performance. First, my
environment:

Data
1-10M documents
5-30 fields of less than 10 bytes each
1-3 fields of 1 KB - 500 KB each
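
As a rough sanity check on the ranges above, the raw field data they imply can be estimated with simple arithmetic. This is only a sketch: the class and method names are hypothetical, it assumes 1 KB = 1024 bytes, it takes 1 byte as the lower bound and 10 bytes as the upper bound for a small field, and it measures raw field bytes rather than the on-disk index size, which also depends on analysis and on which fields are stored.

```java
// Back-of-the-envelope estimate of raw field data. All names here are hypothetical.
final class DataVolumeEstimate {

    /** Raw bytes = docs * (smallFields * smallFieldBytes + largeFields * largeFieldBytes). */
    static long rawBytes(long docs, int smallFields, long smallFieldBytes,
                         int largeFields, long largeFieldBytes) {
        long perDoc = smallFields * smallFieldBytes + largeFields * largeFieldBytes;
        // multiplyExact throws ArithmeticException on overflow instead of silently wrapping
        return Math.multiplyExact(docs, perDoc);
    }

    public static void main(String[] args) {
        long kb = 1024L; // assumption: 1 KB = 1024 bytes

        // Smallest case: 1M documents, 5 one-byte fields, one 1 KB field
        long low = rawBytes(1_000_000L, 5, 1L, 1, 1 * kb);

        // Largest case: 10M documents, 30 ten-byte fields, three 500 KB fields
        long high = rawBytes(10_000_000L, 30, 10L, 3, 500 * kb);

        System.out.println("low  = " + low + " bytes");
        System.out.println("high = " + high + " bytes");
    }
}
```

Under these assumed bounds the estimate spans from about 1 GB to about 15 TB of raw field data, so where the real data falls in that range matters a great deal for the questions below.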

I have three types of queries:

Query 1: 85% usage
1-2 phrase terms, e.g. +id:"651" +id2:"241"
sorted by an arbitrary field, normally the date
5-20 security terms
5k-1M results
can never return stale data

Query 2: 13% usage
10 full wildcard terms, e.g. *search*
sorting is optional
0-200 results
20-200 security terms
can return slightly stale data

Query 3: 2% usage
1-20 mixed terms
sorting is optional
0-200 results
20-200 security terms
can return slightly stale data

1) Does re-opening an IndexSearcher flush all of the caches (filter and
sort)?

2) What is the overhead of opening an IndexSearcher? What does it depend on?

3) What is the recommended approach for updating and refreshing the index
when there is 1 update for every 5 queries?

4) Would Query 1 be better served by a database, given that I would have to
re-open the IndexSearcher every couple of queries?

5) Which would perform better, Solr or Lucene? When is it better to use one
or the other?

6) What else should I look out for?

7) Why is refreshing an IndexSearcher not supported? 


Any help is greatly appreciated.
Thanks,
Moshe
