lucene-java-user mailing list archives

From "Sridhar Raman" <sridhar.ra...@gmail.com>
Subject Re: Swapping between indexes
Date Fri, 14 Mar 2008 08:10:07 GMT
One quick question regarding copying of indexes.  Does the copy include the
index data still held in memory as well, or only the committed index?

On Fri, Mar 7, 2008 at 12:29 AM, Peter Keegan <peterlkeegan@gmail.com>
wrote:

> Sridhar,
>
> We have been using approach 2 in our production system with good results.
> We have separate processes for indexing and searching. The main issue that
> came up was in deleting old indexes (see: http://tinyurl.com/32q8c4). Most
> of our production problems occur during indexing, and we are able to fix
> these without having to interrupt searching at all. This has been a real
> benefit.
>
> Peter
>
>
> On Thu, Mar 6, 2008 at 5:30 AM, Sridhar Raman <sridhar.raman@gmail.com>
> wrote:
>
> > This is my situation.  I have an index, which has a lot of search
> > requests coming into it.  I use just a single instance of IndexSearcher
> > to process these requests.  At the same time, this index is also getting
> > updated by an IndexWriter.  And I want these new changes to be reflected
> > _only_ at certain intervals.  I have thought of a few ways of doing this.
> > Each has its share of problems and pluses.  I would be glad if someone
> > can help me in figuring out the right approach, especially from the
> > performance point of view, as the number of documents that will get
> > indexed is pretty large.
> >
> > Approach 1:
> > Have just one copy of the index for both Search & Index.  At time T,
> > when I need to see the new changes reflected, I close the Searcher, and
> > open it again.
> > - The re-open of the Searcher might be a bit slow (which I could probably
> > solve by using some warm-up threads).
> > - Update and Search on the index at the same time - will this affect the
> > performance?
> > - If server crashes before time T, the new Searcher would reflect the
> > changes, which is not acceptable.  I want the changes to be reflected
> > only at time T.  If server crashes, the index should be the previous T-1
> > index.
> > - Possible problems while optimising the index (as Search is also
> > happening).
> > + Just one copy of the index being stored.
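A minimal sketch of the close-and-reopen step in Approach 1, written without
any Lucene calls. The type parameter S is a hypothetical stand-in for the
searcher type, and SearcherHolder, opener and closer are names made up for
illustration. The new searcher is opened (and could be warmed) before it is
published, so search threads never see a half-open instance. It does not
address the crash concern in the third bullet.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Holds the single shared searcher and swaps it at time T. */
final class SearcherHolder<S> {
    private final AtomicReference<S> current;

    SearcherHolder(S initial) {
        this.current = new AtomicReference<>(initial);
    }

    /** Search threads call this for every request. */
    S get() {
        return current.get();
    }

    /**
     * Opens a fresh searcher, publishes it atomically, then closes the old one.
     * Assumption: no request is still using the old searcher when it is closed;
     * a real version would need reference counting or a grace period.
     */
    void reopen(Supplier<S> opener, Consumer<S> closer) {
        S fresh = opener.get();           // open and warm up before the swap
        S old = current.getAndSet(fresh); // new requests see 'fresh' from here on
        closer.accept(old);
    }
}
```

The opener would be the place to run warm-up queries, so the slow part happens
before the swap rather than on the first live request.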
> >
> > Approach 2:
> > Keep 2 copies of the index - 1 for Search, 1 for Index.  At time T, I
> > just switch the Searcher to a copy of the index that is being updated.
> > - Before I do the switch to the new index, I need to make a copy of it so
> > that the updates continue to happen on the other index.  Is there a
> > convenient way to make this copy?  Is it efficient?
> > - Time taken to create a new Searcher will still be a problem (but this
> > is a problem in the previous approach as well, and we can live with it).
> > + Optimise can happen on an index that is not being read; as a result,
> > its resource requirements would be lower.  And probably even the speed of
> > optimisation.
> > + Faster search as the index update is happening on a different index.
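One way the copy step in Approach 2 could be sketched is as a plain file-level
copy of the index directory, using only java.nio.file and no Lucene API.
IndexCopy and copyIndexFiles are hypothetical names. The sketch assumes the
index is a flat directory of files and that the writer is closed (or otherwise
not touching the files) while the copy runs. A copy like this can only pick up
what has already been written to disk.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class IndexCopy {
    private IndexCopy() {}

    /**
     * Copies every regular file in srcDir into dstDir and returns the number
     * of files copied. Subdirectories are skipped (assumption: the index
     * directory is flat).
     */
    static int copyIndexFiles(Path srcDir, Path dstDir) throws IOException {
        Files.createDirectories(dstDir);
        int copied = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(srcDir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    Files.copy(file, dstDir.resolve(file.getFileName()),
                               StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        }
        return copied;
    }
}
```

At time T the indexing side would close its writer, call copyIndexFiles into a
fresh directory, and hand that directory to the searching side, which then
opens a new Searcher on it. The cost grows with index size, since every file
is copied in full.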
> >
> > So, these are the 2 approaches I am contemplating.  Any pointers on
> > which would be the better approach?
> >
> > Thanks,
> > Sridhar
> >
>
