lucy-dev mailing list archives

From Nick Wellnhofer <wellnho...@aevum.de>
Subject Re: [lucy-dev] Custom Backend for Index : Equivalent of DirectoryFactory.java (Lucene)
Date Thu, 15 Mar 2018 18:21:29 GMT
On 15/03/2018 14:51, bhardwajrajesh1973@gmail.com wrote:
> I went through this link -
> http://blog.mikemccandless.com/2017/09/lucenes-near-real-time-segment-index.html

Lucy doesn't support any of Lucene's replication features.

> I was thinking of implementing this; can you suggest the best method of
> implementing the above methodology?

You could start by simply copying the index directory from the master to the
slaves while locking out access to the index on both master and slaves. Lucy
never modifies an index file once it has been written, so you can use
something equivalent to `rsync --ignore-existing`.
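
As a rough illustration of the "copy only what is missing" idea, here is a minimal Python sketch. The function name `copy_missing` and the `skip_dirs` parameter are made up for this example, and the default of skipping a directory called `locks` is an assumption about where lock files live. It assumes access to the index is already locked out on both sides.

```python
import os
import shutil


def copy_missing(src_dir, dst_dir, skip_dirs=("locks",)):
    """Copy every file under src_dir that does not yet exist under dst_dir.

    Returns the sorted relative paths that were copied.
    skip_dirs is an assumption: directory names that should not be copied.
    """
    copied = []
    for root, dirs, files in os.walk(src_dir):
        # Prune skipped directories in place so os.walk does not descend.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        rel_root = os.path.relpath(root, src_dir)
        for name in files:
            rel_path = os.path.normpath(os.path.join(rel_root, name))
            dst_path = os.path.join(dst_dir, rel_path)
            if os.path.exists(dst_path):
                continue  # existing files are never overwritten
            os.makedirs(os.path.dirname(dst_path), exist_ok=True)
            shutil.copy2(os.path.join(root, name), dst_path)
            copied.append(rel_path)
    return sorted(copied)
```

Calling `copy_missing("/path/on/master", "/path/on/slave")` a second time copies nothing unless new files have appeared on the master in between.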

Here's an overview of the directory layout:

     http://lucy.apache.org/docs/c/Lucy/Docs/FileFormat.html

Ignoring any lock files, the list of files is:

- snapshot_*.json
- schema_*.json
- seg_*/segmeta.json
- seg_*/cfmeta.json
- seg_*/cf.dat

If you want to support concurrent searching on the slaves, things get more 
complicated. You should:

- Derive the list of segments to be copied from the latest snapshot
   file.
- First copy the new schema and segment files.
- Copy the snapshot file at the end and make sure that it's updated
   atomically.
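
The three steps above could be sketched in Python roughly as follows. This is only an outline under several assumptions: that snapshot generations are base-36 suffixes, that the snapshot JSON has an `entries` array of paths relative to the index directory, and that both directories are reachable as local paths. The names `latest_snapshot`, `copy_entry_missing` and `replicate` are invented for the example. The snapshot is written under a temporary name and renamed with `os.replace`, so a searcher sees either no new snapshot or a complete one.

```python
import json
import os
import re
import shutil

SNAPSHOT_RE = re.compile(r"^snapshot_([0-9a-z]+)\.json$")


def latest_snapshot(index_dir):
    """Return the snapshot file name with the highest generation, or None."""
    best = None
    for name in os.listdir(index_dir):
        m = SNAPSHOT_RE.match(name)
        if m:
            gen = int(m.group(1), 36)  # assumption: base-36 generation suffix
            if best is None or gen > best[0]:
                best = (gen, name)
    return best[1] if best else None


def copy_entry_missing(src, dst):
    """Copy a file or directory tree, skipping files that already exist."""
    if os.path.isdir(src):
        os.makedirs(dst, exist_ok=True)
        for name in os.listdir(src):
            copy_entry_missing(os.path.join(src, name), os.path.join(dst, name))
    elif not os.path.exists(dst):
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)


def replicate(master_dir, replica_dir):
    """Copy schema and segment files first, then publish the snapshot."""
    snap = latest_snapshot(master_dir)
    if snap is None:
        return None
    os.makedirs(replica_dir, exist_ok=True)
    with open(os.path.join(master_dir, snap)) as fh:
        entries = json.load(fh)["entries"]  # assumption: "entries" key
    for entry in entries:
        copy_entry_missing(os.path.join(master_dir, entry),
                           os.path.join(replica_dir, entry))
    # Write under a temporary name, then rename atomically into place.
    tmp = os.path.join(replica_dir, snap + ".tmp")
    shutil.copy2(os.path.join(master_dir, snap), tmp)
    os.replace(tmp, os.path.join(replica_dir, snap))
    return snap
```

The temporary name deliberately does not match the `snapshot_*.json` pattern, so a half-written copy is never mistaken for a snapshot.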

If there are concurrent updates on the master, files may be deleted after
you have read the snapshot file. So you should either make sure that no
indexing sessions are running during the file transfer, or acquire Lucy's
deletion lock.

Afterwards you can delete old segments, either by consulting the file list or 
by periodically creating an Indexer on the slaves and immediately destroying it.
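
For the "consulting the file list" variant, a cautious Python sketch might only report candidates and leave the actual deletion to you. As before, the `entries` key in the snapshot JSON and the base-36 style file name pattern are assumptions, and `obsolete_entries` is a hypothetical helper, not part of Lucy.

```python
import json
import os
import re

# Assumption: top-level index entries follow these naming patterns.
INDEX_ENTRY_RE = re.compile(
    r"^(snapshot_[0-9a-z]+\.json|schema_[0-9a-z]+\.json|seg_[0-9a-z]+)$"
)


def obsolete_entries(index_dir, snapshot_name):
    """List top-level index entries not referenced by the given snapshot."""
    with open(os.path.join(index_dir, snapshot_name)) as fh:
        entries = json.load(fh)["entries"]  # assumption: "entries" key
    # Reduce each entry to its top-level name, e.g. "seg_2/cf.dat" -> "seg_2".
    live = {entry.split("/")[0] for entry in entries}
    live.add(snapshot_name)
    return sorted(
        name
        for name in os.listdir(index_dir)
        if INDEX_ENTRY_RE.match(name) and name not in live
    )
```

Files that do not look like index entries are never reported, so unrelated files in the directory are left alone.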

Nick
