lucene-java-user mailing list archives

From Nigel <nigelspl...@gmail.com>
Subject Re: Efficiently reopening remotely-distributed indexes in 2.9?
Date Fri, 09 Oct 2009 13:39:51 GMT
Got it -- thanks, Mark!  (Recently I read elsewhere in the archives of this
list about the value or lack thereof of segments.gen, so skipping that file
was in the back of my mind as well.)

Chris

On Thu, Oct 8, 2009 at 3:04 PM, Mark Miller <markrmiller@gmail.com> wrote:

> Nigel wrote:
> > Thanks, Mark.  That makes sense.  I guess if you do it in the right order,
> > you're guaranteed to have the files in a consistent state, since the only
> > thing that's actually overwritten is the segments.gen file at the end.
> >
> The main thing to do is to copy the segments_N files last - that's what a
> Reader will use to see there is a new index version. The segments.gen
> file is a fallback that, from what I know, is unlikely to be used unless
> you're on NFS.
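The copy order described above can be sketched in plain Java. This is only an illustration, not what Solr or Lucene actually runs: the class name SegmentsLastCopier and its methods are hypothetical, it uses java.nio.file (which is newer than the Java versions current when this thread was written), and it assumes that any non-segments file already present in the target was copied completely on an earlier run.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: copies index files so that segments_N arrives after
// the files it refers to, and segments.gen arrives last of all.
final class SegmentsLastCopier {

    // 0 = ordinary index file, 1 = segments_N, 2 = segments.gen
    static int rank(String name) {
        if (name.equals("segments.gen")) return 2;
        if (name.startsWith("segments_")) return 1;
        return 0;
    }

    // Returns the names of the files actually copied, in copy order.
    static List<String> copyIndex(Path source, Path target) throws IOException {
        Files.createDirectories(target);
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(source)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    names.add(file.getFileName().toString());
                }
            }
        }
        // Ordinary files first, then segments_N, then segments.gen.
        Collections.sort(names, (a, b) -> {
            int byRank = Integer.compare(rank(a), rank(b));
            return byRank != 0 ? byRank : a.compareTo(b);
        });
        List<String> copied = new ArrayList<>();
        for (String name : names) {
            Path dest = target.resolve(name);
            // Assumption: everything except segments.gen is write-once, so a
            // file that already exists in the target is skipped, not rewritten.
            if (rank(name) < 2 && Files.exists(dest)) continue;
            Files.copy(source.resolve(name), dest, StandardCopyOption.REPLACE_EXISTING);
            copied.add(name);
        }
        return copied;
    }
}
```

Calling SegmentsLastCopier.copyIndex(source, target) a second time with no new files in the source copies only segments.gen, which matches the observation that it is the only file overwritten. The sketch does not guard against a reader seeing a half-written segments_N file; copying to a temporary name and renaming would be one way to narrow that window.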
> > What about the technique of creating a copy of the directory with hard
> > links and rsyncing changes into that copy?  Is that only necessary if you
> > want to be using the old and updated versions of the index simultaneously?
> >
> I think this was only necessary before IndexDeletionPolicy - you didn't
> want the IndexWriter removing the files before you were done copying
> them out. You can manage that with a deletion policy now, though.
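For reference, the older hard-link technique asked about above might look roughly like this in Java. It is a sketch under assumptions, not something taken from the thread: the class name HardLinkSnapshot is hypothetical, it relies on java.nio.file.Files.createLink (newer than the Java of this thread), and it requires both directories to be on the same filesystem with hard-link support. It illustrates the pre-deletion-policy approach only; it does not show the deletion-policy approach.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: hard-links every regular file of an index directory
// into a snapshot directory, so the data stays readable there even if the
// original directory entry is deleted later.
final class HardLinkSnapshot {

    // Returns the number of links created on this call.
    static int snapshot(Path indexDir, Path snapshotDir) throws IOException {
        Files.createDirectories(snapshotDir);
        int linked = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) continue;
                Path link = snapshotDir.resolve(file.getFileName().toString());
                if (!Files.exists(link)) {
                    // createLink(link, existing): a new name for the same data
                    Files.createLink(link, file);
                    linked++;
                }
            }
        }
        return linked;
    }
}
```

Because a hard link is a second name for the same data, removing a file from the index directory leaves the snapshot copy intact, which is why the technique protected a copy in progress from the writer's cleanup.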
> > Thanks,
> > Chris
> >
> > On Wed, Oct 7, 2009 at 4:02 PM, Mark Miller <markrmiller@gmail.com> wrote:
> >
> >
> >> Solr just copies them into the same directory - Lucene files are write
> >> once, so it's not much different from what happens locally.
> >>
> >> Nigel wrote:
> >>
> >>> Right now we logically re-open an index by making an updated copy of the
> >>> index in a new directory (using rsync etc.), opening the new copy, and
> >>> closing the old one.  We don't use IndexReader.reopen() because the updated
> >>> index is in a different directory (as opposed to being updated in-place).
> >>>
> >>> (Reading about some of the 2.9 changes motivated me to look into actually
> >>> using reopen().  And Michael Busch and Mark Miller both pointed out that I
> >>> was incorrect in saying that pre-2.9 reopen() wasn't more efficient than
> >>> just opening a new index -- I've read through that code now so I have at
> >>> least a basic understanding of what's happening there.  Anyway, it seems
> >>> like reopen() is a Good Thing, so I'd like to use it. (-:)
> >>>
> >>> So, my real question was whether there is a "recommended" way to update an
> >>> index in-place with files copied from a separate indexing server.
> >>>
> >>> For example, do you simply rsync in the new cfs files, overwrite the
> >>> segments.gen and segments_XX files, and call reopen()?  Or create an updated
> >>> copy in a new directory, then rename new directory to the old name once
> >>> you're sure you've copied everything successfully, then call reopen()?  What
> >>> does Solr do?
> >>>
> >>> Thanks,
> >>> Chris
> >>>
> >
> >
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
