lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Updating the index and searching
Date Wed, 07 Sep 2005 18:52:21 GMT

The best advice I can give on this topic is: don't open a new IndexReader
for every search.  There is a lot of caching that goes on under the
covers (particularly when you sort by things other than SCORE) which is
completely wasted if you open a new IndexReader every time.  If you use
Filters, then you should almost always use the CachingWrapperFilter,
whose cache is also wasted if you open a new IndexReader every time.

At a minimum, use a singleton IndexReader for your whole application, and
don't replace it with a new IndexReader unless
IndexReader.getCurrentVersion says the index has changed since the last
time you opened it.
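
A minimal sketch of that singleton idea, written without any Lucene
imports so it stands alone. SharedReaderHolder and its two hooks are
hypothetical names, not part of Lucene: in a real application the version
hook would presumably wrap IndexReader.getCurrentVersion for your index
directory, and the opener would wrap IndexReader.open.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Holds one shared reader and replaces it only when the index version changes.
 * R stands in for IndexReader; both hooks are illustrative assumptions.
 */
final class SharedReaderHolder<R> {
    private final LongSupplier currentVersion; // assumed to wrap IndexReader.getCurrentVersion(...)
    private final Supplier<R> opener;          // assumed to wrap IndexReader.open(...)
    private R reader;                          // the single shared instance
    private long openedVersion;                // index version seen when 'reader' was opened

    SharedReaderHolder(LongSupplier currentVersion, Supplier<R> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
    }

    /** Returns the shared reader, reopening only if the index version has changed. */
    synchronized R get() {
        // Read the version before opening: if the index changes in between,
        // the worst case is one extra reopen on a later call.
        long version = currentVersion.getAsLong();
        if (reader == null || version != openedVersion) {
            reader = opener.get();
            openedVersion = version;
        }
        return reader;
    }
}
```

The sketch leaves out closing the replaced reader; real code would need to
close it once searches still using it have finished.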

Ideally, don't check IndexReader.getCurrentVersion on every search; check
it only if the last check was more than N seconds ago --
where N can be defined by your needs, but make it as large as possible.
Even if N is very small, and you are making constant updates to your
index, at least this way concurrent searches can take advantage of
each other.
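
A rough sketch of the "check at most every N seconds" approach, again
free of Lucene imports. ThrottledReaderHolder, its hooks, and the
injectable clock are hypothetical names chosen for illustration; the
version hook is assumed to wrap IndexReader.getCurrentVersion and the
opener IndexReader.open.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Shares one reader and consults the index version at most once per interval.
 * R stands in for IndexReader; all hooks are illustrative assumptions.
 */
final class ThrottledReaderHolder<R> {
    private final LongSupplier currentVersion;  // assumed to wrap IndexReader.getCurrentVersion(...)
    private final Supplier<R> opener;           // assumed to wrap IndexReader.open(...)
    private final LongSupplier clockMillis;     // e.g. System::currentTimeMillis
    private final long minCheckIntervalMillis;  // the "N seconds", in milliseconds
    private R reader;
    private long openedVersion;
    private long lastCheckMillis;

    ThrottledReaderHolder(LongSupplier currentVersion, Supplier<R> opener,
                          LongSupplier clockMillis, long minCheckIntervalMillis) {
        this.currentVersion = currentVersion;
        this.opener = opener;
        this.clockMillis = clockMillis;
        this.minCheckIntervalMillis = minCheckIntervalMillis;
    }

    /** Returns the shared reader; the version is only checked when the interval has elapsed. */
    synchronized R get() {
        long now = clockMillis.getAsLong();
        if (reader == null || now - lastCheckMillis >= minCheckIntervalMillis) {
            lastCheckMillis = now;
            long version = currentVersion.getAsLong();
            if (reader == null || version != openedVersion) {
                reader = opener.get();
                openedVersion = version;
            }
        }
        // Searches arriving within the interval all get this same instance,
        // so they share whatever it has cached.
        return reader;
    }
}
```

With this shape, a search may see an index that is up to N seconds stale,
which is the trade-off for not reopening on every update.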


: Date: Wed, 7 Sep 2005 09:44:05 +0100
: From: Paul.Illingworth@saaconsultants.com
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Updating the index and searching
:
:
:
:
:
: Hello,
:
: I have an index into which documents get added and updated (by deleting and
: adding). When I run queries on the index these have to take into account
: all changes on the index so I open a new IndexReader. What I am finding is
: that when the index is large the opening of the index takes a considerable
: time with the effect that queries appear to take a long time to run. If a
: new document is added or an existing document is updated then I need to make
: the changes and then open a new IndexReader so that I am able to see the
: changes.
:
: Does anybody else have a similar problem? Any solutions?
:
: One idea I have been thinking about that may get around the problem is to
: have two indexes, one which is only ever open for reading, the other used
: for reading/writing. The idea is that new documents only get added to the
: read/write index whilst queries and deletes would operate on both indexes.
: Periodically (overnight or when the read/write index reaches a certain
: size) then the read/write index will be merged into the read only index and
: a new empty read/write index created. The thinking behind this is the
: read/write index will be kept small and so constantly opening and closing
: will be relatively quick whilst the read only index will be able to be kept
: open.
:
: One worry I have is that keeping the read only index open for long periods
: without closing it means that the list of deleted documents will be kept in
: memory until the index is closed. If something happens to the server
: hosting the index then these deletes will be lost. Is there any way to
: force the deletes to be flushed to the disk without closing and reopening
: the index?
:
: Any help/ideas would be greatly appreciated.
:
: Regards
:
: Paul I.
:
: P.S. I know this subject has been touched upon before and I have had a look
: through the mailing lists but couldn't find anything. It would be nice if
: the mailing lists had a search facility...
:
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
: For additional commands, e-mail: java-user-help@lucene.apache.org
:



-Hoss



