lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sridhar Raman" <sridhar.ra...@gmail.com>
Subject Re: Swapping between indexes
Date Thu, 06 Mar 2008 13:02:09 GMT
> This way no reader will ever see the changes until you successfully
> close the writer.  If the machine crashes the index is still in the
> starting state as of when the writer was first opened.
Ok, I have a slight doubt in this.  Say I have gone ahead with Approach 1
If I have opened the writer with autoCommit=false, and the system crashes,
does it mean that the changes made to IdxSrch are lost?  If that is the
case, that might be a problem.  What I actually want is something like
this.  When the system crashes in between, the search continues to happen on
the index at T0.  But the updates that were done since T0 also needs to be
preserved.  Would that happen if I set autoCommit to false?

I realise that I want the cake and eat it too.  But that's the problem we
face if we keep just a single copy of the index.

On Thu, Mar 6, 2008 at 4:58 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

>
> A simple variant on Approach 1 would be to open your writer with
> autoCommit=false.
>
> This way no reader will ever see the changes until you successfully
> close the writer.  If the machine crashes the index is still in the
> starting state as of when the writer was first opened.
>
> Also, re-open of Approach 1 should be a bit (not a lot, though there
> is work to make it a lot) faster than wholly new open required in
> approach 2.
>
> There should not be problems optimizing while searching.  Yes, you
> use more disk space, but no more (in fact, less) than approach 2
> requires.
>
> I think approach 2 is only possibly better if the indexing would be
> done on a different computer / IO system.
>
> Mike
>
> Sridhar Raman wrote:
>
> > This is my situation.  I have an index, which has a lot of search
> > requests
> > coming into it.  I use just a single instance of IndexSearcher to
> > process
> > these requests.  At the same time, this index is also getting
> > updated by an
> > IndexWriter.  And I want these new changes to be reflected _only_
> > at certain
> > intervals.  I have thought of a few ways of doing this.  Each has
> > its share
> > of problems and pluses.  I would be glad if someone can help me in
> > figuring
> > out the right approach, especially from the performance point of
> > view, as
> > the number of documents that will get indexed are pretty large.
> >
> > Approach 1:
> > Have just one copy of the index for both Search & Index.  At time
> > T, when I
> > need to see the new changes reflected, I close the Searcher, and
> > open it
> > again.
> > - The re-open of the Searcher might be a bit slow (which I could
> > probably
> > solve by using some warm-up threads).
> > - Update and Search on the index at the same - will this affect the
> > performance?
> > - If server crashes before time T, the new Searcher would reflect the
> > changes, which is not acceptable.  I want the changes to be
> > reflected only
> > at time T.  If server crashes, the index should be the previous T-1
> > index.
> > - Possible problems while optimising the index (as Search is also
> > happening).
> > + Just one copy of the index being stored.
> >
> > Approach 2:
> > Keep 2 copies of the index - 1 for Search, 1 for Index.  At time T,
> > I just
> > switch the Searcher to a copy of index that is being updated.
> > - Before I do the switch to the new index, I need to make a copy of
> > it so
> > that the updates continue to happen on the other index.  Is there a
> > convenient way to make this copy?  Is it efficient?
> > - Time taken to create a new Searcher will still be a problem (but
> > this is a
> > problem in the previous approach as well, and we can live with it).
> > + Optimise can happen on an index that is not being read, as a
> > result, its
> > resource requirements would be lesser.  And probably even the speed of
> > optimisation.
> > + Faster search as the index update is happening on a different index.
> >
> > So, these are the 2 approaches I am contemplating about.  Any
> > pointers which
> > would be the better approach?
> >
> > Thanks,
> > Sridhar
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message