lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Live index upgrading
Date Mon, 17 Jun 2019 16:04:04 GMT
Let’s back up a bit. What version of Lucene are you using? Starting with Lucene 8, any index
that’s ever been touched by Lucene 6 will not open. It does not matter if the index has
been completely rewritten. It does not matter if it’s been run through IndexUpgrader,
which just does a forceMerge to 1 segment. A version marker is recorded when a segment is created,
and the earliest one is preserved across merges. So say you have two segments, one created
with 6 and one with 7. The Lucene 6 marker is preserved when they are merged.

Now, if any segment has the Lucene 6 marker, the index will not be opened by Lucene 8.

If you’re using Lucene 7, then this error implies that one or more of your segments was
created with Lucene 5 or earlier.
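
To make the rule concrete, here is a toy model in plain Python. It is only a sketch of the behaviour described above, not Lucene code: the Segment class, merge() and can_open() are hypothetical names, and the "current major version or the one before it" check is the rule as stated in this message.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Segment:
    """Toy stand-in for a segment; hypothetical, not a Lucene class."""
    name: str
    min_major: int  # oldest Lucene major version that ever contributed data


def merge(name: str, segments: Iterable[Segment]) -> Segment:
    """A merge rewrites the data but keeps the oldest marker of its inputs."""
    return Segment(name, min(s.min_major for s in segments))


def can_open(reader_major: int, segments: Iterable[Segment]) -> bool:
    """Assumed rule: a reader accepts its own major version and the previous one."""
    return all(s.min_major >= reader_major - 1 for s in segments)


old = Segment("_0", 6)            # created with Lucene 6
new = Segment("_1", 7)            # created with Lucene 7
merged = merge("_2", [old, new])  # e.g. the result of forceMerge(1) under Lucene 7

print(merged.min_major)       # 6
print(can_open(7, [merged]))  # True
print(can_open(8, [merged]))  # False
```

The merged segment still carries the 6 marker, which is why rewriting the index with Lucene 7 does not make it readable by Lucene 8.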

So you probably need to re-index from scratch on whatever version of Lucene you want to use.

Best,
Erick



> On Jun 17, 2019, at 8:41 AM, David Allouche <david@allouche.net> wrote:
> 
> Hello,
> 
> I use Lucene with PyLucene on a public-facing web application. We have a moderately large
> index (~24M documents, ~11GB index data), with a constant stream of new documents.
> 
> I recently upgraded to PyLucene 7.
> 
> When trying to test the new release of PyLucene 8, I encountered an IndexFormatTooOldException
> because my index conversion from Lucene 6 to Lucene 7 was not complete.
> 
> I found IndexUpgrader, and I had a look at its implementation. I would very much like
> to avoid taking down the service during the index upgrade, so I believe I cannot use IndexUpgrader
> because I need the write lock to be held by the web application to index new documents.
> 
> So I figure I could get the desired result with an IndexWriter.forceMerge(1). But the
> documentation says "This is a horribly costly operation, especially when you pass a small
> maxNumSegments; usually you should only call this if the index is static (will no longer be
> changed)." https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/index/IndexWriter.html#forceMerge-int-
> 
> And indeed, forceMerge tends to be killed by the kernel OOM killer on my development VM. I
> want to avoid this failure mode in production. I could increase the VM's memory until it works, but
> I would rather have a less brutal approach to upgrading a live index. Something that could
> run in the background with reasonable amounts of anonymous memory.
> 
> What is the recommended approach to upgrading a live index?
> 
> How can I know from the code that the index needs upgrading at all? I could add a manual
> knob to start an upgrade, but it would be better if it occurred transparently when I upgrade
> PyLucene.
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 

