lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Iterating Over All Documents On a Changing Index
Date Mon, 21 Oct 2019 20:58:22 GMT
This is the right place to ask these questions indeed.

This is a good way to iterate over documents. Regarding your 2nd
question, Lucene IndexReaders are point-in-time views of the data, so
changes made after a reader was opened won't become visible through it.
The tricky part of this kind of problem is usually dealing with
documents that get indexed after you have pulled a new reader, while
you are still in the process of reindexing.
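
To make that race concrete, here is a minimal sketch in plain Java. It
is a model only and does not use the Lucene API: the map stands in for
the live index, snapshot() stands in for opening a reader, and
ReindexSketch, index, snapshot, reindexOne and get are hypothetical
names. It shows one possible guard, which assumes that writes arriving
during the reindex already use the new rules: remember which ids were
written after the snapshot and skip them when rewriting.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Plain-Java model of the race; NOT Lucene code. All names are hypothetical.
class ReindexSketch {
    // Stand-in for the live index: id -> current document.
    private final Map<String, String> live = new HashMap<>();
    // Ids written through the normal indexing path since the last snapshot.
    private final Set<String> touchedSinceSnapshot = new HashSet<>();

    // Normal indexing path (assumed to already apply the new schema).
    public synchronized void index(String id, String doc) {
        live.put(id, doc);
        touchedSinceSnapshot.add(id);
    }

    // Point-in-time copy, analogous to pulling a new reader.
    public synchronized Map<String, String> snapshot() {
        touchedSinceSnapshot.clear();
        return new HashMap<>(live);
    }

    // Rewrites one document from the snapshot unless a newer write exists.
    // Returns false when the document was skipped.
    public synchronized boolean reindexOne(String id, String snapshotDoc,
                                           UnaryOperator<String> newSchema) {
        if (touchedSinceSnapshot.contains(id)) {
            return false; // a newer version is already live; do not clobber it
        }
        live.put(id, newSchema.apply(snapshotDoc));
        return true;
    }

    public synchronized String get(String id) {
        return live.get(id);
    }
}
```

Without the check in reindexOne, the rewrite would replace a document
that changed after the snapshot with a copy built from stale data,
which is the lost update asked about in question 2.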

On Sat, Oct 19, 2019 at 1:35 AM Matt Davis <kryptonics411@gmail.com> wrote:
>
> Hi All,
>
> I am working on implementing an in-place reindex using Lucene.  In my
> case, I have BSON documents stored in a binary field and a set of rules
> that pull fields out of the BSON and index them into different Lucene
> fields with different analyzers.  I would like to be able to change these
> rules / schema and then iterate over the documents, indexing them using the
> new schema.
>
> I have come up with the following code block:
> https://gist.github.com/mdavis95/f600e0a8233d0a1232eff77645d1dc8a
>
> I have two questions:
> 1) Is this a good way to iterate over the documents?
> 2) How can I manage documents changing while I am doing this?  New documents
> coming in should be fine, I believe, but changes to existing documents could
> be lost, if I understand correctly.
>
> I hope that this is the right place to ask this question and I apologize if
> this is obvious or has been asked and answered.
>
> Thanks,
> Matt



-- 
Adrien
