lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Efficiently reopening remotely-distributed indexes in 2.9?
Date Thu, 08 Oct 2009 19:04:25 GMT
Nigel wrote:
> Thanks, Mark.  That makes sense.  I guess if you do it in the right order,
> you're guaranteed to have the files in a consistent state, since the only
> thing that's actually overwritten is the segments.gen file at the end.
>   
The main thing is to copy the segments_N files last - that's what a
Reader uses to see that there is a new index version. The segments.gen
file is a fallback that, from what I know, is unlikely to be used unless
you're on NFS.
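The ordering described above can be sketched with plain JDK file APIs. This is only an illustration, not what Solr or rsync actually does: the class IndexSync and its methods copyOrder and sync are hypothetical names, and placing segments.gen after segments_N is an assumption that follows the description earlier in the thread. Skipping write.lock and skipping files that already exist with the same size are likewise assumptions based on index files being write-once.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene or Solr.
final class IndexSync {

    /** Orders names as: data files, then segments_N, then segments.gen. */
    static List<String> copyOrder(Collection<String> names) {
        List<String> data = new ArrayList<>();
        List<String> commits = new ArrayList<>();
        boolean hasGen = false;
        for (String name : names) {
            if (name.equals("write.lock")) {
                continue; // assumption: the writer's lock file is never shipped
            } else if (name.equals("segments.gen")) {
                hasGen = true;
            } else if (name.startsWith("segments_")) {
                commits.add(name);
            } else {
                data.add(name);
            }
        }
        // Sorted by name only to make the order deterministic.
        Collections.sort(data);
        Collections.sort(commits);
        List<String> ordered = new ArrayList<>(data);
        ordered.addAll(commits);
        if (hasGen) {
            ordered.add("segments.gen");
        }
        return ordered;
    }

    /** Copies index files from source into target in the order above. */
    static void sync(Path source, Path target) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(source)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    names.add(file.getFileName().toString());
                }
            }
        }
        Files.createDirectories(target);
        for (String name : copyOrder(names)) {
            Path src = source.resolve(name);
            Path dest = target.resolve(name);
            // Write-once files already present with the same size are skipped;
            // segments.gen is the one file that is rewritten in place.
            boolean alreadyCopied = !name.equals("segments.gen")
                    && Files.exists(dest)
                    && Files.size(dest) == Files.size(src);
            if (!alreadyCopied) {
                Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```

After a call such as IndexSync.sync(sourceDir, targetDir) completes, the searcher side would call reopen() on its existing reader. Until the new segments_N file has landed, a reader opened on the target directory still sees the previous commit.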
> What about the technique of creating a copy of the directory with hard links
> and rsyncing changes into that copy?  Is that only necessary if you want to
> be using the old and updated versions of the index simultaneously?
>   
I think this was only necessary before IndexDeletionPolicy existed - you
didn't want the IndexWriter removing the files before you were done
copying them out. You can manage that with a deletion policy now though.
> Thanks,
> Chris
>
> On Wed, Oct 7, 2009 at 4:02 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>   
>> Solr just copies them into the same directory - Lucene files are write
>> once, so it's not much different than what happens locally.
>>
>> Nigel wrote:
>>     
>>> Right now we logically re-open an index by making an updated copy of the
>>> index in a new directory (using rsync etc.), opening the new copy, and
>>> closing the old one.  We don't use IndexReader.reopen() because the updated
>>> index is in a different directory (as opposed to being updated in-place).
>>>
>>> (Reading about some of the 2.9 changes motivated me to look into actually
>>> using reopen().  And Michael Busch and Mark Miller both pointed out that I
>>> was incorrect in saying that pre-2.9 reopen() wasn't more efficient than
>>> just opening a new index -- I've read through that code now so I have at
>>> least a basic understanding of what's happening there.  Anyway, it seems
>>> like reopen() is a Good Thing, so I'd like to use it. (-:)
>>>
>>> So, my real question was whether there is a "recommended" way to update an
>>> index in-place with files copied from a separate indexing server.
>>>
>>> For example, do you simply rsync in the new cfs files, overwrite the
>>> segments.gen and segments_XX files, and call reopen()?  Or create an updated
>>> copy in a new directory, then rename new directory to the old name once
>>> you're sure you've copied everything successfully, then call reopen()?  What
>>> does Solr do?
>>>
>>> Thanks,
>>> Chris
>
>   


-- 
- Mark

http://www.lucidimagination.com





