lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] [Updated] (LUCENE-6287) Corrupt index (missing .si file) on first 4.x commit to a 3.x index
Date Tue, 24 Feb 2015 20:00:06 GMT


Michael McCandless updated LUCENE-6287:
    Attachment: LUCENE-6287.patch

Patch w/ a simple fix ... I'm beasting the test and so far so good ... I'll leave it running.

IW (IndexWriter) already holds an incRef'd set of files that are in-flight for commit, so I
just fixed it to re-compute that set after SIS.prepareCommit (SegmentInfos, which may write
the .si/marker files) and incRef the new set with IFD (IndexFileDeleter).  This protects the
files while the commit runs; when the commit finishes we incRef them with IFD again, and
they are permanent after that.
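
To make the idea concrete, here is a minimal, self-contained sketch of the reference-counting
pattern described above.  It is not the patch and it does not use Lucene's classes: the names
RefCountingDeleter, deleteUnreferenced and commit, and the file names, are hypothetical
stand-ins chosen for illustration.  It assumes only what the description says: the prepare
step may write a new file, and a concurrent checkpoint may delete any file that nothing
references.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.Set;

class InFlightCommitSketch {

    /** Toy reference-counting deleter; a stand-in, not Lucene's IndexFileDeleter. */
    static final class RefCountingDeleter {
        private final Map<String, Integer> refCounts = new HashMap<>();
        private final Set<String> deleted = new TreeSet<>();

        void incRef(Collection<String> files) {
            for (String f : files) {
                refCounts.merge(f, 1, Integer::sum);
            }
        }

        void decRef(Collection<String> files) {
            for (String f : files) {
                Integer count = refCounts.get(f);
                if (count == null) {
                    throw new IllegalStateException("not referenced: " + f);
                }
                if (count == 1) {
                    refCounts.remove(f);
                    deleted.add(f); // last reference released, file goes away
                } else {
                    refCounts.put(f, count - 1);
                }
            }
        }

        /** Simulates a checkpoint from another thread: drop every unreferenced file. */
        void deleteUnreferenced(Collection<String> filesOnDisk) {
            for (String f : filesOnDisk) {
                if (!refCounts.containsKey(f)) {
                    deleted.add(f);
                }
            }
        }

        boolean isDeleted(String f) {
            return deleted.contains(f);
        }
    }

    /** Returns true if the file written during prepare survives the concurrent checkpoint. */
    public static boolean commit(boolean recomputeAfterPrepare) {
        RefCountingDeleter deleter = new RefCountingDeleter();

        // In-flight set computed before the prepare step runs.
        List<String> before = Arrays.asList("_0.fdt", "_0.fdx");
        deleter.incRef(before);

        // Hypothetical prepare step: it writes one extra file (think of an upgraded .si).
        List<String> after = Arrays.asList("_0.fdt", "_0.fdx", "_0.si");

        if (recomputeAfterPrepare) {
            // The fix as described: re-compute the set, protect it, release the stale set.
            deleter.incRef(after);
            deleter.decRef(before);
        }

        // A merge finishing concurrently triggers a checkpoint while the commit is in flight.
        deleter.deleteUnreferenced(after);

        return !deleter.isDeleted("_0.si");
    }

    public static void main(String[] args) {
        System.out.println("without recompute: .si survives = " + commit(false));
        System.out.println("with recompute:    .si survives = " + commit(true));
    }
}
```

In this toy model the first call reports that the .si file was deleted and the second reports
that it survived, which mirrors the missing-file symptom in the issue.  The real code paths
involve more state (the final incRef when the commit completes, locking, multiple segments),
so treat this only as an illustration of why the in-flight set has to be re-computed after
the prepare step.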

> Corrupt index (missing .si file) on first 4.x commit to a 3.x index
> -------------------------------------------------------------------
>                 Key: LUCENE-6287
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Blocker
>             Fix For: 4.10.4
>         Attachments: LUCENE-6287.patch, LUCENE-6287.patch
> If you have a 3.x index, and you open it with a 4.x IndexWriter for
> the first time, and you do something that kicks off merges while
> concurrently committing, it's possible the index will corrupt itself
> with exceptions like this:
> {noformat}
> java.nio.file.NoSuchFileException: /l/tmp/reruns.TestBackwardsCompatibility3x.testMergeDuringUpgrade.t2/lucene.index.TestBackwardsCompatibility3x-71F31CCCEF6853A-001/manysegments.362-006/
> 	at sun.nio.fs.UnixException.translateToIOException(
> 	at sun.nio.fs.UnixException.rethrowAsIOException(
> 	at sun.nio.fs.UnixException.rethrowAsIOException(
> 	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(
> 	at
> 	at
> 	at
> 	at
> 	at
> 	at
> 	at org.apache.lucene.index.SegmentInfos$1.doBody(
> 	at org.apache.lucene.index.SegmentInfos$
> 	at org.apache.lucene.index.SegmentInfos$
> 	at
> 	at org.apache.lucene.index.CheckIndex.checkIndex(
> 	at org.apache.lucene.util.TestUtil.checkIndex(
> 	at org.apache.lucene.util.TestUtil.checkIndex(
> 	at
> 	at org.apache.lucene.index.TestBackwardsCompatibility3x.testMergeDuringUpgrade(
> {noformat}
> Back compat tests in Elasticsearch hit this, and at first I thought maybe LUCENE-6279
> was the cause (I still think we should fix that) but after further debugging there is a different
> concurrency bug lurking here.
> I have a test case which after substantial beasting is able to reproduce the bug, but
> I don't yet have a fix.  I think IW is missing a checkpoint after writing a new commit...
