Date: Thu, 4 Apr 2013 15:25:17 +0000 (UTC)
From: "Michael McCandless (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Subject: [jira] [Updated] (LUCENE-4738) Killed JVM when first commit was running will generate a corrupted index

     [ https://issues.apache.org/jira/browse/LUCENE-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4738:
---------------------------------------

    Attachment: LUCENE-4738.patch

Patch, with test and fix.

The problem here was that IndexFileDeleter was attempting to load the initial commit point even though IndexWriter had already detected that there was no valid segments file. I just fixed IndexWriter to record this and pass a boolean telling IFD whether it should open the initial commit.
However, if you try to run CheckIndex or open an IndexReader on an index in this state (corrupt initial commit), both will fail, since there is in fact no valid index.

> Killed JVM when first commit was running will generate a corrupted index
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-4738
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4738
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.0
>         Environment: OS: Linux 2.6.32-220.23.1.el6.x86_64
> Java: java version "1.7.0_05"
> Lucene: lucene-core-4.0.0
>            Reporter: Billow Gao
>         Attachments: LUCENE-4738.patch, LUCENE-4738_test.patch
>
>
> 1. Start a NEW IndexWriterBuilder on an empty folder,
>    add some documents to the index
> 2. Call commit
> 3. When the segments_1 file has been created with 0 bytes, kill the JVM
> We end up with a corrupted index containing an empty segments_1.
> We only see the issue when the first commit crashes.
> Also, if you try to open an IndexSearcher on a new index whose first commit has not finished yet, you will see an exception like:
> ===========================================================================
> org.apache.lucene.index.IndexNotFoundException: no segments* file found in org.apache.lucene.store.MMapDirectory@C:\tmp\testdir lockFactory=org.apache.lucene.store.NativeFSLockFactory@6ee00df: files: [write.lock, _0.fdt, _0.fdx]
> 	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:741)
> 	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
> 	at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:65)
> ===========================================================================
> So when a new index is created, we should first create an empty index. We should not wait for the commit/close call to create the segments file.
> If an empty index were already there, a power failure during the first commit would not leave a corrupted index.
> And a concurrent IndexSearcher could access the index (no match is better than an exception).
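
For readers without the patch at hand, the idea in the update above (the writer records whether a valid segments file exists and passes a boolean to the file deleter) can be illustrated roughly as follows. This is only a toy sketch using plain java.nio, not the code in LUCENE-4738.patch: the class and method names (InitialCommitSketch, DeleterSketch, openWriter, hasValidCommit) are hypothetical, and "valid" is simplified to "some segments_N file is non-empty", which is an assumption and not how Lucene validates a commit.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class InitialCommitSketch {

    /** Toy validity check (assumption): a commit is "valid" if some segments_N file is non-empty. */
    static boolean hasValidCommit(Path dir) {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "segments_*")) {
            for (Path file : files) {
                if (Files.size(file) > 0) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical stand-in for the file deleter. */
    static final class DeleterSketch {
        final boolean loadedInitialCommit;

        DeleterSketch(Path dir, boolean initialCommitExists) {
            if (initialCommitExists) {
                // The writer vouched for a commit, so reading it is expected to succeed.
                if (!hasValidCommit(dir)) {
                    throw new IllegalStateException("told to load a commit, but none is readable");
                }
                loadedInitialCommit = true;
            } else {
                // The writer already found no valid segments file: skip loading entirely.
                loadedInitialCommit = false;
            }
        }
    }

    /** Hypothetical stand-in for the writer: records what it detected and passes it on. */
    static DeleterSketch openWriter(Path dir) {
        boolean initialCommitExists = hasValidCommit(dir);
        return new DeleterSketch(dir, initialCommitExists);
    }

    /** Simulates a crash that left a 0-byte segments_1, then reopens the directory. */
    static boolean reopenAfterCrashedFirstCommit() {
        try {
            Path dir = Files.createTempDirectory("lucene-4738-sketch");
            Files.createFile(dir.resolve("segments_1")); // 0 bytes, as in the report
            return openWriter(dir).loadedInitialCommit;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Same layout, but segments_1 holds a placeholder byte standing in for a finished commit. */
    static boolean reopenAfterCompletedCommit() {
        try {
            Path dir = Files.createTempDirectory("lucene-4738-sketch");
            Files.write(dir.resolve("segments_1"), new byte[] {1});
            return openWriter(dir).loadedInitialCommit;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("crashed first commit: loaded=" + reopenAfterCrashedFirstCommit());
        System.out.println("completed commit:     loaded=" + reopenAfterCompletedCommit());
    }
}
```

In this sketch, reopening after the simulated crash succeeds without trying to read the empty segments_1, while the completed-commit case loads the commit as usual. A reader opened on the crashed directory would still fail, as noted in the update.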