Subject: svn commit: r620576 [2/3] - in /lucene/java/trunk: ./ docs/ src/java/org/apache/lucene/index/ src/java/org/apache/lucene/store/ src/site/src/documentation/content/xdocs/ src/test/org/apache/lucene/index/ src/test/org/apache/lucene/store/ src/test/org/a...
Date: Mon, 11 Feb 2008 18:56:13 -0000
To: java-commits@lucene.apache.org
From: mikemccand@apache.org
Reply-To: java-dev@lucene.apache.org
Message-Id: <20080211185626.B792A1A983E@eris.apache.org>

Modified: lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java?rev=620576&r1=620575&r2=620576&view=diff ============================================================================== --- lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java (original) +++ lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java Mon Feb 11 10:56:09 2008 @@ -27,11 +27,13 @@ import org.apache.lucene.store.AlreadyClosedException; import org.apache.lucene.util.BitVector; import org.apache.lucene.util.Parameter; +import org.apache.lucene.util.Constants; import java.io.File; import java.io.IOException; import java.io.PrintStream; import java.util.List; +import java.util.Collection; import java.util.ArrayList; import java.util.HashMap; import java.util.Set; @@ -83,33 +85,45 @@ for changing the {@link MergeScheduler}).

-

The optional autoCommit argument to the - constructors - controls visibility of the changes to {@link IndexReader} instances reading the same index. - When this is false, changes are not - visible until {@link #close()} is called. - Note that changes will still be flushed to the - {@link org.apache.lucene.store.Directory} as new files, - but are not committed (no new segments_N file - is written referencing the new files) until {@link #close} is - called. If something goes terribly wrong (for example the - JVM crashes) before {@link #close()}, then - the index will reflect none of the changes made (it will - remain in its starting state). - You can also call {@link #abort()}, which closes the writer without committing any - changes, and removes any index +

[Deprecated: Note that in 3.0, IndexWriter will + no longer accept autoCommit=true (it will be hardwired to + false). You can always call {@link IndexWriter#commit()} yourself + when needed]. The optional autoCommit argument to the constructors + controls visibility of the changes to {@link IndexReader} + instances reading the same index. When this is + false, changes are not visible until {@link + #close()} is called. Note that changes will still be + flushed to the {@link org.apache.lucene.store.Directory} + as new files, but are not committed (no new + segments_N file is written referencing the + new files, nor are the files sync'd to stable storage) + until {@link #commit} or {@link #close} is called. If something + goes terribly wrong (for example the JVM crashes), then + the index will reflect none of the changes made since the + last commit, or the starting state if commit was not called. + You can also call {@link #abort}, which closes the writer + without committing any changes, and removes any index files that had been flushed but are now unreferenced. This mode is useful for preventing readers from refreshing at a bad time (for example after you've done all your - deletes but before you've done your adds). - It can also be used to implement simple single-writer - transactional semantics ("all or none").

+ deletes but before you've done your adds). It can also be + used to implement simple single-writer transactional + semantics ("all or none").

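To make the "all or none" semantics described above concrete, here is a minimal usage sketch. It is not part of this patch; the class name, index path and field values are placeholders, and it uses the 2.3/2.4-era API shown elsewhere in this diff (the autoCommit constructor, abort(), MaxFieldLength.LIMITED).

    import java.io.IOException;
    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class AllOrNoneUpdate {
      // Adds 100 documents as a single "transaction": either all of them
      // become visible (on close) or none of them do (on abort).
      public static void updateBatch(String indexPath) throws IOException {
        Directory dir = FSDirectory.getDirectory(indexPath);
        // autoCommit=false: nothing is committed until close().  Note this
        // constructor is deprecated by this patch; in 3.0 autoCommit is
        // always false.
        IndexWriter writer = new IndexWriter(dir, false, new WhitespaceAnalyzer(),
                                             IndexWriter.MaxFieldLength.LIMITED);
        boolean success = false;
        try {
          for (int i = 0; i < 100; i++) {
            Document doc = new Document();
            doc.add(new Field("id", Integer.toString(i),
                              Field.Store.YES, Field.Index.UN_TOKENIZED));
            writer.addDocument(doc);
          }
          success = true;
        } finally {
          if (success)
            writer.close();   // commits the whole batch; readers now see it
          else
            writer.abort();   // discards flushed-but-uncommitted files
        }
      }
    }

The point of the pattern is that the writer, not the flush schedule, decides when a new segments_N is written, so a failure mid-batch leaves the index exactly as it was.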
When autoCommit is true then - every flush is also a commit ({@link IndexReader} - instances will see each flush as changes to the index). - This is the default, to match the behavior before 2.2. - When running in this mode, be careful not to refresh your + the writer will periodically commit on its own. This is + the default, to match the behavior before 2.2. However, + in 3.0, autoCommit will be hardwired to false. There is + no guarantee when exactly an auto commit will occur (it + used to be after every flush, but it is now after every + completed merge, as of 2.4). If you want to force a + commit, call {@link #commit} or close the writer. Once + a commit has finished, {@link IndexReader} instances will + see the changes to the index as of that commit. When + running in this mode, be careful not to refresh your readers while optimize or segment merges are taking place as this can tie up substantial disk space.

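For comparison, a sketch of the pattern that replaces autoCommit=true going forward: the application calls the commit() method added by this patch whenever it wants changes to become visible. The sketch is illustrative only (RAMDirectory, field names and counts are placeholders); the constructor used is the one the deprecation notes below recommend.

    import java.io.IOException;
    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;

    public class ExplicitCommit {
      public static void main(String[] args) throws IOException {
        Directory dir = new RAMDirectory();
        // No autoCommit argument: commits happen only when asked for.
        IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(),
                                             true, IndexWriter.MaxFieldLength.LIMITED);
        for (int i = 0; i < 10; i++) {
          Document doc = new Document();
          doc.add(new Field("content", "aaa " + i,
                            Field.Store.NO, Field.Index.TOKENIZED));
          writer.addDocument(doc);
        }

        writer.commit();   // flush, sync the files, write a new segments_N

        IndexReader reader = IndexReader.open(dir);
        System.out.println("docs visible after commit: " + reader.numDocs());
        reader.close();

        writer.close();
        dir.close();
      }
    }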
@@ -250,7 +264,20 @@ * set (see {@link #setInfoStream}). */ public final static int MAX_TERM_LENGTH = DocumentsWriter.MAX_TERM_LENGTH; - + + /** + * Default for {@link #getMaxSyncPauseSeconds}. On + * Windows this defaults to 10.0 seconds; elsewhere it's + * 0. + */ + public final static double DEFAULT_MAX_SYNC_PAUSE_SECONDS; + static { + if (Constants.WINDOWS) + DEFAULT_MAX_SYNC_PAUSE_SECONDS = 10.0; + else + DEFAULT_MAX_SYNC_PAUSE_SECONDS = 0.0; + } + // The normal read buffer size defaults to 1024, but // increasing this during merging seems to yield // performance gains. However we don't want to increase @@ -269,14 +296,18 @@ private Similarity similarity = Similarity.getDefault(); // how to normalize - private boolean commitPending; // true if segmentInfos has changes not yet committed + private volatile boolean commitPending; // true if segmentInfos has changes not yet committed private SegmentInfos rollbackSegmentInfos; // segmentInfos we will fallback to if the commit fails + private HashMap rollbackSegments; private SegmentInfos localRollbackSegmentInfos; // segmentInfos we will fallback to if the commit fails private boolean localAutoCommit; // saved autoCommit during local transaction private boolean autoCommit = true; // false if we should commit only on close private SegmentInfos segmentInfos = new SegmentInfos(); // the segments + private int syncCount; + private int syncCountSaved = -1; + private DocumentsWriter docWriter; private IndexFileDeleter deleter; @@ -302,6 +333,12 @@ private long mergeGen; private boolean stopMerges; + private int flushCount; + private double maxSyncPauseSeconds = DEFAULT_MAX_SYNC_PAUSE_SECONDS; + + // Last (right most) SegmentInfo created by a merge + private SegmentInfo lastMergeInfo; + /** * Used internally to throw an {@link * AlreadyClosedException} if this IndexWriter has been @@ -432,7 +469,9 @@ * Constructs an IndexWriter for the index in path. * Text will be analyzed with a. If create * is true, then a new, empty index will be created in - * path, replacing the index already there, if any. + * path, replacing the index already there, + * if any. Note that autoCommit defaults to true, but + * starting in 3.0 it will be hardwired to false. * * @param path the path to the index directory * @param a the analyzer to use @@ -487,6 +526,8 @@ * Text will be analyzed with a. If create * is true, then a new, empty index will be created in * path, replacing the index already there, if any. + * Note that autoCommit defaults to true, but starting in 3.0 + * it will be hardwired to false. * * @param path the path to the index directory * @param a the analyzer to use @@ -541,6 +582,8 @@ * Text will be analyzed with a. If create * is true, then a new, empty index will be created in * d, replacing the index already there, if any. + * Note that autoCommit defaults to true, but starting in 3.0 + * it will be hardwired to false. * * @param d the index directory * @param a the analyzer to use @@ -595,6 +638,8 @@ * path, first creating it if it does not * already exist. Text will be analyzed with * a. + * Note that autoCommit defaults to true, but starting in 3.0 + * it will be hardwired to false. * * @param path the path to the index directory * @param a the analyzer to use @@ -641,6 +686,8 @@ * path, first creating it if it does not * already exist. Text will be analyzed with * a. + * Note that autoCommit defaults to true, but starting in 3.0 + * it will be hardwired to false. 
* * @param path the path to the index directory * @param a the analyzer to use @@ -687,6 +734,8 @@ * d, first creating it if it does not * already exist. Text will be analyzed with * a. + * Note that autoCommit defaults to true, but starting in 3.0 + * it will be hardwired to false. * * @param d the index directory * @param a the analyzer to use @@ -746,6 +795,10 @@ * @throws IOException if the directory cannot be * read/written to or if there is any other low-level * IO error + * @deprecated This will be removed in 3.0, when + * autoCommit will be hardwired to false. Use {@link + * #IndexWriter(Directory,Analyzer,MaxFieldLength)} + * instead, and call {@link #commit} when needed. */ public IndexWriter(Directory d, boolean autoCommit, Analyzer a, MaxFieldLength mfl) throws CorruptIndexException, LockObtainFailedException, IOException { @@ -798,6 +851,10 @@ * if it does not exist and create is * false or if there is any other low-level * IO error + * @deprecated This will be removed in 3.0, when + * autoCommit will be hardwired to false. Use {@link + * #IndexWriter(Directory,Analyzer,boolean,MaxFieldLength)} + * instead, and call {@link #commit} when needed. */ public IndexWriter(Directory d, boolean autoCommit, Analyzer a, boolean create, MaxFieldLength mfl) throws CorruptIndexException, LockObtainFailedException, IOException { @@ -837,6 +894,31 @@ * IndexDeletionPolicy}, for the index in d, * first creating it if it does not already exist. Text * will be analyzed with a. + * Note that autoCommit defaults to true, but starting in 3.0 + * it will be hardwired to false. + * + * @param d the index directory + * @param a the analyzer to use + * @param deletionPolicy see above + * @param mfl whether or not to limit field lengths + * @throws CorruptIndexException if the index is corrupt + * @throws LockObtainFailedException if another writer + * has this index open (write.lock could not + * be obtained) + * @throws IOException if the directory cannot be + * read/written to or if there is any other low-level + * IO error + */ + public IndexWriter(Directory d, Analyzer a, IndexDeletionPolicy deletionPolicy, MaxFieldLength mfl) + throws CorruptIndexException, LockObtainFailedException, IOException { + init(d, a, false, deletionPolicy, true, mfl.getLimit()); + } + + /** + * Expert: constructs an IndexWriter with a custom {@link + * IndexDeletionPolicy}, for the index in d, + * first creating it if it does not already exist. Text + * will be analyzed with a. * * @param d the index directory * @param autoCommit see above @@ -851,6 +933,10 @@ * @throws IOException if the directory cannot be * read/written to or if there is any other low-level * IO error + * @deprecated This will be removed in 3.0, when + * autoCommit will be hardwired to false. Use {@link + * #IndexWriter(Directory,Analyzer,IndexDeletionPolicy,MaxFieldLength)} + * instead, and call {@link #commit} when needed. */ public IndexWriter(Directory d, boolean autoCommit, Analyzer a, IndexDeletionPolicy deletionPolicy, MaxFieldLength mfl) throws CorruptIndexException, LockObtainFailedException, IOException { @@ -889,6 +975,37 @@ * create is true, then a new, empty index * will be created in d, replacing the index * already there, if any. + * Note that autoCommit defaults to true, but starting in 3.0 + * it will be hardwired to false. 
+ * + * @param d the index directory + * @param a the analyzer to use + * @param create true to create the index or overwrite + * the existing one; false to append to the existing + * index + * @param deletionPolicy see above + * @param mfl whether or not to limit field lengths + * @throws CorruptIndexException if the index is corrupt + * @throws LockObtainFailedException if another writer + * has this index open (write.lock could not + * be obtained) + * @throws IOException if the directory cannot be read/written to, or + * if it does not exist and create is + * false or if there is any other low-level + * IO error + */ + public IndexWriter(Directory d, Analyzer a, boolean create, IndexDeletionPolicy deletionPolicy, MaxFieldLength mfl) + throws CorruptIndexException, LockObtainFailedException, IOException { + init(d, a, create, false, deletionPolicy, true, mfl.getLimit()); + } + + /** + * Expert: constructs an IndexWriter with a custom {@link + * IndexDeletionPolicy}, for the index in d. + * Text will be analyzed with a. If + * create is true, then a new, empty index + * will be created in d, replacing the index + * already there, if any. * * @param d the index directory * @param autoCommit see above @@ -907,6 +1024,10 @@ * if it does not exist and create is * false or if there is any other low-level * IO error + * @deprecated This will be removed in 3.0, when + * autoCommit will be hardwired to false. Use {@link + * #IndexWriter(Directory,Analyzer,boolean,IndexDeletionPolicy,MaxFieldLength)} + * instead, and call {@link #commit} when needed. */ public IndexWriter(Directory d, boolean autoCommit, Analyzer a, boolean create, IndexDeletionPolicy deletionPolicy, MaxFieldLength mfl) throws CorruptIndexException, LockObtainFailedException, IOException { @@ -984,15 +1105,22 @@ } catch (IOException e) { // Likely this means it's a fresh directory } - segmentInfos.write(directory); + segmentInfos.commit(directory); } else { segmentInfos.read(directory); + + // We assume that this segments_N was previously + // properly sync'd: + for(int i=0;i If an Exception is hit during close, eg due to disk * full or some other reason, then both the on-disk index @@ -1490,33 +1654,16 @@ mergeScheduler.close(); - synchronized(this) { - if (commitPending) { - boolean success = false; - try { - segmentInfos.write(directory); // now commit changes - success = true; - } finally { - if (!success) { - if (infoStream != null) - message("hit exception committing segments file during close"); - deletePartialSegmentsFile(); - } - } - if (infoStream != null) - message("close: wrote segments file \"" + segmentInfos.getCurrentSegmentFileName() + "\""); - - deleter.checkpoint(segmentInfos, true); + if (infoStream != null) + message("now call final sync()"); - commitPending = false; - rollbackSegmentInfos = null; - } + sync(true, 0); - if (infoStream != null) - message("at close: " + segString()); + if (infoStream != null) + message("at close: " + segString()); + synchronized(this) { docWriter = null; - deleter.close(); } @@ -1527,7 +1674,9 @@ writeLock.release(); // release write lock writeLock = null; } - closed = true; + synchronized(this) { + closed = true; + } } finally { synchronized(this) { @@ -1581,34 +1730,24 @@ // Perform the merge cfsWriter.close(); - - for(int i=0;iNote: if autoCommit=false, flushed data would still - * not be visible to readers, until {@link #close} is called. + *

Note: while this will force buffered docs to be + pushed into the index, it will not make these docs + visible to a reader. Use {@link #commit} instead. * @throws CorruptIndexException if the index is corrupt * @throws IOException if there is a low-level IO error + @deprecated please call {@link #commit} instead */ public final void flush() throws CorruptIndexException, IOException { flush(true, false); } /**

Commits all pending updates (added & deleted documents) + * to the index, and syncs all referenced index files, + * such that a reader will see the changes. Note that + * this does not wait for any running background merges to + * finish. This may be a costly operation, so you should + * test the cost in your application and do it only when + * really necessary.

+ * + *

Note that this operation calls Directory.sync on + the index files. That call should not return until the + file contents & metadata are on stable storage. For + FSDirectory, this calls the OS's fsync. But, beware: + some hardware devices may in fact cache writes even + during fsync, and return before the bits are actually + on stable storage, to give the appearance of faster + performance. If you have such a device, and it does + not have a battery backup (for example), then on power + loss it may still lose data. Lucene cannot guarantee + consistency on such devices.

+ */ + public final void commit() throws CorruptIndexException, IOException { + commit(true); + } + + private final void commit(boolean triggerMerges) throws CorruptIndexException, IOException { + flush(triggerMerges, true); + sync(true, 0); + } + + /** * Flush all in-memory buffered udpates (adds and deletes) * to the Directory. * @param triggerMerge if true, we may merge segments (if @@ -2681,10 +2852,15 @@ maybeMerge(); } + // TODO: this method should not have to be entirely + // synchronized, ie, merges should be allowed to commit + // even while a flush is happening private synchronized final boolean doFlush(boolean flushDocStores) throws CorruptIndexException, IOException { // Make sure no threads are actively adding a document + flushCount++; + // Returns true if docWriter is currently aborting, in // which case we skip flushing this segment if (docWriter.pauseAllThreads()) { @@ -2717,10 +2893,18 @@ // apply to more than just the last flushed segment boolean flushDeletes = docWriter.hasDeletes(); + int docStoreOffset = docWriter.getDocStoreOffset(); + + // docStoreOffset should only be non-zero when + // autoCommit == false + assert !autoCommit || 0 == docStoreOffset; + + boolean docStoreIsCompoundFile = false; + if (infoStream != null) { message(" flush: segment=" + docWriter.getSegment() + " docStoreSegment=" + docWriter.getDocStoreSegment() + - " docStoreOffset=" + docWriter.getDocStoreOffset() + + " docStoreOffset=" + docStoreOffset + " flushDocs=" + flushDocs + " flushDeletes=" + flushDeletes + " flushDocStores=" + flushDocStores + @@ -2729,14 +2913,6 @@ message(" index before flush " + segString()); } - int docStoreOffset = docWriter.getDocStoreOffset(); - - // docStoreOffset should only be non-zero when - // autoCommit == false - assert !autoCommit || 0 == docStoreOffset; - - boolean docStoreIsCompoundFile = false; - // Check if the doc stores must be separately flushed // because other segments, besides the one we are about // to flush, reference it @@ -2754,60 +2930,63 @@ // If we are flushing docs, segment must not be null: assert segment != null || !flushDocs; - if (flushDocs || flushDeletes) { - - SegmentInfos rollback = null; - - if (flushDeletes) - rollback = (SegmentInfos) segmentInfos.clone(); + if (flushDocs) { boolean success = false; + final int flushedDocCount; try { - if (flushDocs) { - - if (0 == docStoreOffset && flushDocStores) { - // This means we are flushing private doc stores - // with this segment, so it will not be shared - // with other segments - assert docStoreSegment != null; - assert docStoreSegment.equals(segment); - docStoreOffset = -1; - docStoreIsCompoundFile = false; - docStoreSegment = null; - } - - int flushedDocCount = docWriter.flush(flushDocStores); - - newSegment = new SegmentInfo(segment, - flushedDocCount, - directory, false, true, - docStoreOffset, docStoreSegment, - docStoreIsCompoundFile); - segmentInfos.addElement(newSegment); - } - - if (flushDeletes) { - // we should be able to change this so we can - // buffer deletes longer and then flush them to - // multiple flushed segments, when - // autoCommit=false - applyDeletes(flushDocs); - doAfterFlush(); - } - - checkpoint(); + flushedDocCount = docWriter.flush(flushDocStores); success = true; } finally { if (!success) { - if (infoStream != null) message("hit exception flushing segment " + segment); - - if (flushDeletes) { + docWriter.abort(null); + deleter.refresh(segment); + } + } + + if (0 == docStoreOffset && flushDocStores) { + // This means we are flushing private doc stores + 
// with this segment, so it will not be shared + // with other segments + assert docStoreSegment != null; + assert docStoreSegment.equals(segment); + docStoreOffset = -1; + docStoreIsCompoundFile = false; + docStoreSegment = null; + } + + // Create new SegmentInfo, but do not add to our + // segmentInfos until deletes are flushed + // successfully. + newSegment = new SegmentInfo(segment, + flushedDocCount, + directory, false, true, + docStoreOffset, docStoreSegment, + docStoreIsCompoundFile); + } + + if (flushDeletes) { + try { + SegmentInfos rollback = (SegmentInfos) segmentInfos.clone(); - // Carefully check if any partial .del files - // should be removed: + boolean success = false; + try { + // we should be able to change this so we can + // buffer deletes longer and then flush them to + // multiple flushed segments only when a commit() + // finally happens + applyDeletes(newSegment); + success = true; + } finally { + if (!success) { + if (infoStream != null) + message("hit exception flushing deletes"); + + // Carefully remove any partially written .del + // files final int size = rollback.size(); for(int i=0;i 0 && - segmentInfos.info(segmentInfos.size()-1) == newSegment) - segmentInfos.remove(segmentInfos.size()-1); - } - if (flushDocs) - docWriter.abort(null); - deletePartialSegmentsFile(); - deleter.checkpoint(segmentInfos, false); - - if (segment != null) - deleter.refresh(segment); + } } + } finally { + // Regardless of success of failure in flushing + // deletes, we must clear them from our buffer: + docWriter.clearBufferedDeletes(); } + } - deleter.checkpoint(segmentInfos, autoCommit); + if (flushDocs) + segmentInfos.addElement(newSegment); - if (flushDocs && mergePolicy.useCompoundFile(segmentInfos, - newSegment)) { - success = false; - try { - docWriter.createCompoundFile(segment); - newSegment.setUseCompoundFile(true); - checkpoint(); - success = true; - } finally { - if (!success) { - if (infoStream != null) - message("hit exception creating compound file for newly flushed segment " + segment); - newSegment.setUseCompoundFile(false); - deleter.deleteFile(segment + "." + IndexFileNames.COMPOUND_FILE_EXTENSION); - deletePartialSegmentsFile(); - } - } + if (flushDocs || flushDeletes) + checkpoint(); - deleter.checkpoint(segmentInfos, autoCommit); + doAfterFlush(); + + if (flushDocs && mergePolicy.useCompoundFile(segmentInfos, newSegment)) { + // Now build compound file + boolean success = false; + try { + docWriter.createCompoundFile(segment); + success = true; + } finally { + if (!success) { + if (infoStream != null) + message("hit exception creating compound file for newly flushed segment " + segment); + deleter.deleteFile(segment + "." + IndexFileNames.COMPOUND_FILE_EXTENSION); + } } - - return true; - } else { - return false; + + newSegment.setUseCompoundFile(true); + checkpoint(); } + + return flushDocs || flushDeletes; } finally { docWriter.clearFlushPending(); @@ -2913,9 +3086,101 @@ return first; } + /** Carefully merges deletes for the segments we just + * merged. This is tricky because, although merging will + * clear all deletes (compacts the documents), new + * deletes may have been flushed to the segments since + * the merge was started. This method "carries over" + * such new deletes onto the newly merged segment, and + * saves the results deletes file (incrementing the + * delete generation for merge.info). If no deletes were + * flushed, no new deletes file is saved. 
*/ + synchronized private void commitMergedDeletes(MergePolicy.OneMerge merge) throws IOException { + final SegmentInfos sourceSegmentsClone = merge.segmentsClone; + final SegmentInfos sourceSegments = merge.segments; + + if (infoStream != null) + message("commitMerge " + merge.segString(directory)); + + // Carefully merge deletes that occurred after we + // started merging: + + BitVector deletes = null; + int docUpto = 0; + + final int numSegmentsToMerge = sourceSegments.size(); + for(int i=0;i 0; - boolean success = false; try { try { - if (merge.info == null) - mergeInit(merge); + mergeInit(merge); if (infoStream != null) message("now merge\n merge=" + merge.segString(directory) + "\n index=" + segString()); @@ -3131,11 +3286,17 @@ } finally { synchronized(this) { try { - if (!success && infoStream != null) - message("hit exception during merge"); mergeFinish(merge); + if (!success) { + if (infoStream != null) + message("hit exception during merge"); + addMergeException(merge); + if (merge.info != null && !segmentInfos.contains(merge.info)) + deleter.refresh(merge.info.name); + } + // This merge (and, generally, any change to the // segments) may now enable new merges, so we call // merge policy & update pending merges. @@ -3200,6 +3361,11 @@ final synchronized void mergeInit(MergePolicy.OneMerge merge) throws IOException { assert merge.registerDone; + assert !merge.optimize || merge.maxNumSegmentsOptimize > 0; + + if (merge.info != null) + // mergeInit already done + return; if (merge.isAborted()) return; @@ -3323,6 +3489,50 @@ docStoreOffset, docStoreSegment, docStoreIsCompoundFile); + + // Also enroll the merged segment into mergingSegments; + // this prevents it from getting selected for a merge + // after our merge is done but while we are building the + // CFS: + mergingSegments.add(merge.info); + } + + /** This is called after merging a segment and before + * building its CFS. Return true if the files should be + * sync'd. If you return false, then the source segment + * files that were merged cannot be deleted until the CFS + * file is built & sync'd. So, returning false consumes + * more transient disk space, but saves performance of + * not having to sync files which will shortly be deleted + * anyway. + * @deprecated -- this will be removed in 3.0 when + * autoCommit is hardwired to false */ + private synchronized boolean doCommitBeforeMergeCFS(MergePolicy.OneMerge merge) throws IOException { + long freeableBytes = 0; + final int size = merge.segments.size(); + for(int i=0;i totalBytes) + return true; + else + return false; } /** Does fininishing for a merge, which is fast but holds @@ -3338,6 +3548,7 @@ final int end = sourceSegments.size(); for(int i=0;i X minutes or + // more than Y bytes have been written, etc. + if (autoCommit) + sync(false, merge.info.sizeInBytes()); + return mergedDocCount; } @@ -3495,23 +3676,11 @@ mergeExceptions.add(merge); } - private void deletePartialSegmentsFile() throws IOException { - if (segmentInfos.getLastGeneration() != segmentInfos.getGeneration()) { - String segmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS, - "", - segmentInfos.getGeneration()); - if (infoStream != null) - message("now delete partial segments file \"" + segmentFileName + "\""); - - deleter.deleteFile(segmentFileName); - } - } - // Called during flush to apply any buffered deletes. 
If // flushedNewSegment is true then a new segment was just // created and flushed from the ram segments, so we will // selectively apply the deletes to that new segment. - private final void applyDeletes(boolean flushedNewSegment) throws CorruptIndexException, IOException { + private final void applyDeletes(SegmentInfo newSegment) throws CorruptIndexException, IOException { final HashMap bufferedDeleteTerms = docWriter.getBufferedDeleteTerms(); final List bufferedDeleteDocIDs = docWriter.getBufferedDeleteDocIDs(); @@ -3521,13 +3690,13 @@ bufferedDeleteDocIDs.size() + " deleted docIDs on " + segmentInfos.size() + " segments."); - if (flushedNewSegment) { + if (newSegment != null) { IndexReader reader = null; try { // Open readers w/o opening the stored fields / // vectors because these files may still be held // open for writing by docWriter - reader = SegmentReader.get(segmentInfos.info(segmentInfos.size() - 1), false); + reader = SegmentReader.get(newSegment, false); // Apply delete terms to the segment just flushed from ram // apply appropriately so that a delete term is only applied to @@ -3544,10 +3713,7 @@ } } - int infosEnd = segmentInfos.size(); - if (flushedNewSegment) { - infosEnd--; - } + final int infosEnd = segmentInfos.size(); for (int i = 0; i < infosEnd; i++) { IndexReader reader = null; @@ -3567,9 +3733,6 @@ } } } - - // Clean up bufferedDeleteTerms. - docWriter.clearBufferedDeletes(); } // For test purposes. @@ -3642,6 +3805,236 @@ } return buffer.toString(); + } + + // Files that have been sync'd already + private HashSet synced = new HashSet(); + + // Files that are now being sync'd + private HashSet syncing = new HashSet(); + + private boolean startSync(String fileName, Collection pending) { + synchronized(synced) { + if (!synced.contains(fileName)) { + if (!syncing.contains(fileName)) { + syncing.add(fileName); + return true; + } else { + pending.add(fileName); + return false; + } + } else + return false; + } + } + + private void finishSync(String fileName, boolean success) { + synchronized(synced) { + assert syncing.contains(fileName); + syncing.remove(fileName); + if (success) + synced.add(fileName); + synced.notifyAll(); + } + } + + /** Blocks until all files in syncing are sync'd */ + private boolean waitForAllSynced(Collection syncing) throws IOException { + synchronized(synced) { + Iterator it = syncing.iterator(); + while(it.hasNext()) { + final String fileName = (String) it.next(); + while(!synced.contains(fileName)) { + if (!syncing.contains(fileName)) + // There was an error because a file that was + // previously syncing failed to appear in synced + return false; + else + try { + synced.wait(); + } catch (InterruptedException ie) { + continue; + } + } + } + return true; + } + } + + /** Pauses before syncing. On Windows, at least, it's + * best (performance-wise) to pause in order to let OS + * flush writes to disk on its own, before forcing a + * sync. 
+ * @deprecated -- this will be removed in 3.0 when + * autoCommit is hardwired to false */ + private void syncPause(long sizeInBytes) { + if (mergeScheduler instanceof ConcurrentMergeScheduler && maxSyncPauseSeconds > 0) { + // Rough heuristic: for every 10 MB, we pause for 1 + // second, up until the max + long pauseTime = (long) (1000*sizeInBytes/10/1024/1024); + final long maxPauseTime = (long) (maxSyncPauseSeconds*1000); + if (pauseTime > maxPauseTime) + pauseTime = maxPauseTime; + final int sleepCount = (int) (pauseTime / 100); + for(int i=0;i 0) + // Force all subsequent syncs to include up through + // the final info in the current segments. This + // ensure that a call to commit() will force another + // sync (due to merge finishing) to sync all flushed + // segments as well: + lastMergeInfo = toSync.info(numSegmentsToSync-1); + + mySyncCount = syncCount++; + deleter.incRef(toSync, false); + + commitPending = newCommitPending; + } + + boolean success0 = false; + + try { + + // Loop until all files toSync references are sync'd: + while(true) { + + final Collection pending = new ArrayList(); + + for(int i=0;i syncCountSaved) { + + if (segmentInfos.getGeneration() > toSync.getGeneration()) + toSync.updateGeneration(segmentInfos); + + boolean success = false; + try { + toSync.commit(directory); + success = true; + } finally { + // Have our master segmentInfos record the + // generations we just sync'd + segmentInfos.updateGeneration(toSync); + if (!success) { + commitPending = true; + message("hit exception committing segments file"); + } + } + message("commit complete"); + + syncCountSaved = mySyncCount; + + deleter.checkpoint(toSync, true); + setRollbackSegmentInfos(); + } else + message("sync superseded by newer infos"); + } + + message("done all syncs"); + + success0 = true; + + } finally { + synchronized(this) { + deleter.decRef(toSync); + if (!success0) + commitPending = true; + } + } } /** Modified: lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java?rev=620576&r1=620575&r2=620576&view=diff ============================================================================== --- lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java (original) +++ lucene/java/trunk/src/java/org/apache/lucene/index/SegmentInfos.java Mon Feb 11 10:56:09 2008 @@ -20,6 +20,8 @@ import org.apache.lucene.store.Directory; import org.apache.lucene.store.IndexInput; import org.apache.lucene.store.IndexOutput; +import org.apache.lucene.store.ChecksumIndexOutput; +import org.apache.lucene.store.ChecksumIndexInput; import java.io.File; import java.io.FileNotFoundException; @@ -55,8 +57,12 @@ * vectors and stored fields file. */ public static final int FORMAT_SHARED_DOC_STORE = -4; + /** This format adds a checksum at the end of the file to + * ensure all bytes were successfully written. */ + public static final int FORMAT_CHECKSUM = -5; + /* This must always point to the most recent file format. 
*/ - private static final int CURRENT_FORMAT = FORMAT_SHARED_DOC_STORE; + private static final int CURRENT_FORMAT = FORMAT_CHECKSUM; public int counter = 0; // used to name new segments /** @@ -197,7 +203,7 @@ // Clear any previous segments: clear(); - IndexInput input = directory.openInput(segmentFileName); + ChecksumIndexInput input = new ChecksumIndexInput(directory.openInput(segmentFileName)); generation = generationFromSegmentsFileName(segmentFileName); @@ -226,6 +232,13 @@ else version = input.readLong(); // read version } + + if (format <= FORMAT_CHECKSUM) { + final long checksumNow = input.getChecksum(); + final long checksumThen = input.readLong(); + if (checksumNow != checksumThen) + throw new CorruptIndexException("checksum mismatch in segments file"); + } success = true; } finally { @@ -257,7 +270,7 @@ }.run(); } - public final void write(Directory directory) throws IOException { + private final void write(Directory directory) throws IOException { String segmentFileName = getNextSegmentFileName(); @@ -268,7 +281,7 @@ generation++; } - IndexOutput output = directory.createOutput(segmentFileName); + ChecksumIndexOutput output = new ChecksumIndexOutput(directory.createOutput(segmentFileName)); boolean success = false; @@ -280,29 +293,31 @@ output.writeInt(size()); // write infos for (int i = 0; i < size(); i++) { info(i).write(output); - } - } - finally { + } + final long checksum = output.getChecksum(); + output.writeLong(checksum); + success = true; + } finally { + boolean success2 = false; try { output.close(); - success = true; + success2 = true; } finally { - if (!success) { + if (!success || !success2) // Try not to leave a truncated segments_N file in // the index: directory.deleteFile(segmentFileName); - } } } try { - output = directory.createOutput(IndexFileNames.SEGMENTS_GEN); + IndexOutput genOutput = directory.createOutput(IndexFileNames.SEGMENTS_GEN); try { - output.writeInt(FORMAT_LOCKLESS); - output.writeLong(generation); - output.writeLong(generation); + genOutput.writeInt(FORMAT_LOCKLESS); + genOutput.writeLong(generation); + genOutput.writeLong(generation); } finally { - output.close(); + genOutput.close(); } } catch (IOException e) { // It's OK if we fail to write this file since it's @@ -620,7 +635,7 @@ retry = true; } - } else { + } else if (0 == method) { // Segment file has advanced since our last loop, so // reset retry: retry = false; @@ -700,5 +715,51 @@ SegmentInfos infos = new SegmentInfos(); infos.addAll(super.subList(first, last)); return infos; + } + + // Carry over generation numbers from another SegmentInfos + void updateGeneration(SegmentInfos other) { + assert other.generation > generation; + lastGeneration = other.lastGeneration; + generation = other.generation; + } + + /** Writes & syncs to the Directory dir, taking care to + * remove the segments file on exception */ + public final void commit(Directory dir) throws IOException { + boolean success = false; + try { + write(dir); + success = true; + } finally { + if (!success) { + // Must carefully compute fileName from "generation" + // since lastGeneration isn't incremented: + final String segmentFileName = IndexFileNames.fileNameFromGeneration(IndexFileNames.SEGMENTS, + "", + generation); + dir.deleteFile(segmentFileName); + } + } + + // NOTE: if we crash here, we have left a segments_N + // file in the directory in a possibly corrupt state (if + // some bytes made it to stable storage and others + // didn't). 
But, the segments_N file now includes + // checksum at the end, which should catch this case. + // So when a reader tries to read it, it will throw a + // CorruptIndexException, which should cause the retry + // logic in SegmentInfos to kick in and load the last + // good (previous) segments_N-1 file. + + final String fileName = getCurrentSegmentFileName(); + success = false; + try { + dir.sync(fileName); + success = true; + } finally { + if (!success) + dir.deleteFile(fileName); + } } } Added: lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexInput.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexInput.java?rev=620576&view=auto ============================================================================== --- lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexInput.java (added) +++ lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexInput.java Mon Feb 11 10:56:09 2008 @@ -0,0 +1,67 @@ +package org.apache.lucene.store; + +/** + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import java.io.IOException; +import java.util.zip.CRC32; +import java.util.zip.Checksum; + +/** Writes bytes through to a primary IndexOutput, computing + * checksum as it goes. Note that you cannot use seek(). 
*/ +public class ChecksumIndexInput extends IndexInput { + IndexInput main; + Checksum digest; + + public ChecksumIndexInput(IndexInput main) { + this.main = main; + digest = new CRC32(); + } + + public byte readByte() throws IOException { + final byte b = main.readByte(); + digest.update(b); + return b; + } + + public void readBytes(byte[] b, int offset, int len) + throws IOException { + main.readBytes(b, offset, len); + digest.update(b, offset, len); + } + + + public long getChecksum() { + return digest.getValue(); + } + + public void close() throws IOException { + main.close(); + } + + public long getFilePointer() { + return main.getFilePointer(); + } + + public void seek(long pos) { + throw new RuntimeException("not allowed"); + } + + public long length() { + return main.length(); + } +} Propchange: lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexInput.java ------------------------------------------------------------------------------ svn:eol-style = native Added: lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexOutput.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexOutput.java?rev=620576&view=auto ============================================================================== --- lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexOutput.java (added) +++ lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexOutput.java Mon Feb 11 10:56:09 2008 @@ -0,0 +1,68 @@ +package org.apache.lucene.store; + +/** + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import java.io.IOException; +import java.util.zip.CRC32; +import java.util.zip.Checksum; + +/** Writes bytes through to a primary IndexOutput, computing + * checksum. 
Note that you cannot use seek().*/ +public class ChecksumIndexOutput extends IndexOutput { + IndexOutput main; + Checksum digest; + + public ChecksumIndexOutput(IndexOutput main) { + this.main = main; + digest = new CRC32(); + } + + public void writeByte(byte b) throws IOException { + digest.update(b); + main.writeByte(b); + } + + public void writeBytes(byte[] b, int offset, int length) throws IOException { + digest.update(b, offset, length); + main.writeBytes(b, offset, length); + } + + public long getChecksum() { + return digest.getValue(); + } + + public void flush() throws IOException { + main.flush(); + } + + public void close() throws IOException { + main.close(); + } + + public long getFilePointer() { + return main.getFilePointer(); + } + + public void seek(long pos) { + throw new RuntimeException("not allowed"); + } + + public long length() throws IOException { + return main.length(); + } +} Propchange: lucene/java/trunk/src/java/org/apache/lucene/store/ChecksumIndexOutput.java ------------------------------------------------------------------------------ svn:eol-style = native Modified: lucene/java/trunk/src/java/org/apache/lucene/store/Directory.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/store/Directory.java?rev=620576&r1=620575&r2=620576&view=diff ============================================================================== --- lucene/java/trunk/src/java/org/apache/lucene/store/Directory.java (original) +++ lucene/java/trunk/src/java/org/apache/lucene/store/Directory.java Mon Feb 11 10:56:09 2008 @@ -83,6 +83,11 @@ Returns a stream writing this file. */ public abstract IndexOutput createOutput(String name) throws IOException; + /** Ensure that any writes to this file are moved to + * stable storage. Lucene uses this to properly commit + * changes to the index, to prevent a machine/OS crash + * from corrupting the index. */ + public void sync(String name) throws IOException {} /** Returns a stream reading an existing file. 
*/ public abstract IndexInput openInput(String name) Modified: lucene/java/trunk/src/java/org/apache/lucene/store/FSDirectory.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/store/FSDirectory.java?rev=620576&r1=620575&r2=620576&view=diff ============================================================================== --- lucene/java/trunk/src/java/org/apache/lucene/store/FSDirectory.java (original) +++ lucene/java/trunk/src/java/org/apache/lucene/store/FSDirectory.java Mon Feb 11 10:56:09 2008 @@ -435,6 +435,39 @@ return new FSIndexOutput(file); } + public void sync(String name) throws IOException { + File fullFile = new File(directory, name); + boolean success = false; + int retryCount = 0; + IOException exc = null; + while(!success && retryCount < 5) { + retryCount++; + RandomAccessFile file = null; + try { + try { + file = new RandomAccessFile(fullFile, "rw"); + file.getFD().sync(); + success = true; + } finally { + if (file != null) + file.close(); + } + } catch (IOException ioe) { + if (exc == null) + exc = ioe; + try { + // Pause 5 msec + Thread.sleep(5); + } catch (InterruptedException ie) { + Thread.currentThread().interrupt(); + } + } + } + if (!success) + // Throw original exception + throw exc; + } + // Inherit javadoc public IndexInput openInput(String name) throws IOException { return openInput(name, BufferedIndexInput.BUFFER_SIZE); Modified: lucene/java/trunk/src/site/src/documentation/content/xdocs/fileformats.xml URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/site/src/documentation/content/xdocs/fileformats.xml?rev=620576&r1=620575&r2=620576&view=diff ============================================================================== --- lucene/java/trunk/src/site/src/documentation/content/xdocs/fileformats.xml (original) +++ lucene/java/trunk/src/site/src/documentation/content/xdocs/fileformats.xml Mon Feb 11 10:56:09 2008 @@ -819,18 +819,24 @@ IsCompoundFile>SegCount

- 2.3 and above: + 2.3: Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField, NormGenNumField, IsCompoundFile>SegCount

+

+ 2.4 and above: + Segments --> Format, Version, NameCounter, SegCount, <SegName, SegSize, DelGen, DocStoreOffset, [DocStoreSegment, DocStoreIsCompoundFile], HasSingleNormFile, NumField, + NormGenNumField, + IsCompoundFile>SegCount, Checksum +

Format, NameCounter, SegCount, SegSize, NumField, DocStoreOffset --> Int32

- Version, DelGen, NormGen --> Int64 + Version, DelGen, NormGen, Checksum --> Int64

@@ -842,7 +848,7 @@

- Format is -1 as of Lucene 1.4, -3 (SegmentInfos.FORMAT_SINGLE_NORM_FILE) as of Lucene 2.1 and 2.2, and -4 (SegmentInfos.FORMAT_SHARED_DOC_STORE) as of Lucene 2.3 + Format is -1 as of Lucene 1.4, -3 (SegmentInfos.FORMAT_SINGLE_NORM_FILE) as of Lucene 2.1 and 2.2, -4 (SegmentInfos.FORMAT_SHARED_DOC_STORE) as of Lucene 2.3, and -5 (SegmentInfos.FORMAT_CHECKSUM) as of Lucene 2.4.

@@ -925,6 +931,13 @@ shares a single set of these files with other segments.

+ +

+ Checksum contains the CRC32 checksum of all bytes + in the segments_N file up until the checksum. + This is used to verify integrity of the file on + opening the index. +

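A standalone sketch of the trailing-checksum scheme described above, not part of the patch and using plain java.io streams rather than Lucene's Directory API. It mirrors what ChecksumIndexOutput/ChecksumIndexInput do: accumulate a CRC32 over every byte written, append it as a final Int64, and on read recompute the CRC over everything before that trailing long and compare. (In Lucene the reader knows where the payload ends by parsing the segments data; here the payload length is passed in for simplicity.)

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.zip.CRC32;

    public class TrailingChecksumDemo {

      // Write the payload followed by a CRC32 of the payload as a trailing long.
      static void write(String path, byte[] payload) throws IOException {
        CRC32 digest = new CRC32();
        DataOutputStream out = new DataOutputStream(new FileOutputStream(path));
        try {
          out.write(payload);
          digest.update(payload, 0, payload.length);
          out.writeLong(digest.getValue());   // the trailing Checksum (Int64)
        } finally {
          out.close();
        }
      }

      // Re-read the file, recompute the CRC32 over everything before the
      // last 8 bytes, and compare it with the stored value.
      static boolean verify(String path, int payloadLength) throws IOException {
        CRC32 digest = new CRC32();
        DataInputStream in = new DataInputStream(new FileInputStream(path));
        try {
          byte[] payload = new byte[payloadLength];
          in.readFully(payload);
          digest.update(payload, 0, payload.length);
          long stored = in.readLong();
          return stored == digest.getValue();  // mismatch => treat file as corrupt
        } finally {
          in.close();
        }
      }
    }

This is what lets a reader detect a segments_N file that only partially reached stable storage after a crash and fall back to the previous, fully committed generation.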
Modified: lucene/java/trunk/src/test/org/apache/lucene/index/TestAtomicUpdate.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/index/TestAtomicUpdate.java?rev=620576&r1=620575&r2=620576&view=diff ============================================================================== --- lucene/java/trunk/src/test/org/apache/lucene/index/TestAtomicUpdate.java (original) +++ lucene/java/trunk/src/test/org/apache/lucene/index/TestAtomicUpdate.java Mon Feb 11 10:56:09 2008 @@ -20,12 +20,8 @@ import org.apache.lucene.store.*; import org.apache.lucene.document.*; import org.apache.lucene.analysis.*; -import org.apache.lucene.index.*; import org.apache.lucene.search.*; import org.apache.lucene.queryParser.*; -import org.apache.lucene.util._TestUtil; - -import org.apache.lucene.util.LuceneTestCase; import java.util.Random; import java.io.File; @@ -83,7 +79,6 @@ // Update all 100 docs... for(int i=0; i<100; i++) { Document d = new Document(); - int n = RANDOM.nextInt(); d.add(new Field("id", Integer.toString(i), Field.Store.YES, Field.Index.UN_TOKENIZED)); d.add(new Field("contents", English.intToEnglish(i+10*count), Field.Store.NO, Field.Index.TOKENIZED)); writer.updateDocument(new Term("id", Integer.toString(i)), d); @@ -127,7 +122,7 @@ d.add(new Field("contents", English.intToEnglish(i), Field.Store.NO, Field.Index.TOKENIZED)); writer.addDocument(d); } - writer.flush(); + writer.commit(); IndexerThread indexerThread = new IndexerThread(writer, threads); threads[0] = indexerThread; Modified: lucene/java/trunk/src/test/org/apache/lucene/index/TestBackwardsCompatibility.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/index/TestBackwardsCompatibility.java?rev=620576&r1=620575&r2=620576&view=diff ============================================================================== --- lucene/java/trunk/src/test/org/apache/lucene/index/TestBackwardsCompatibility.java (original) +++ lucene/java/trunk/src/test/org/apache/lucene/index/TestBackwardsCompatibility.java Mon Feb 11 10:56:09 2008 @@ -349,7 +349,6 @@ IndexWriter writer = new IndexWriter(dir, autoCommit, new WhitespaceAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED); writer.setRAMBufferSizeMB(16.0); - //IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true); for(int i=0;i<35;i++) { addDoc(writer, i); } @@ -390,11 +389,8 @@ expected = new String[] {"_0.cfs", "_0_1.del", "_0_1.s" + contentFieldIndex, - "segments_4", + "segments_3", "segments.gen"}; - - if (!autoCommit) - expected[3] = "segments_3"; String[] actual = dir.list(); Arrays.sort(expected); Added: lucene/java/trunk/src/test/org/apache/lucene/index/TestCrash.java URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/index/TestCrash.java?rev=620576&view=auto ============================================================================== --- lucene/java/trunk/src/test/org/apache/lucene/index/TestCrash.java (added) +++ lucene/java/trunk/src/test/org/apache/lucene/index/TestCrash.java Mon Feb 11 10:56:09 2008 @@ -0,0 +1,181 @@ +package org.apache.lucene.index; + +/** + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import java.io.IOException; + +import org.apache.lucene.util.LuceneTestCase; +import org.apache.lucene.analysis.WhitespaceAnalyzer; +import org.apache.lucene.store.MockRAMDirectory; +import org.apache.lucene.store.NoLockFactory; +import org.apache.lucene.document.Document; +import org.apache.lucene.document.Field; + +public class TestCrash extends LuceneTestCase { + + private IndexWriter initIndex() throws IOException { + return initIndex(new MockRAMDirectory()); + } + + private IndexWriter initIndex(MockRAMDirectory dir) throws IOException { + dir.setLockFactory(NoLockFactory.getNoLockFactory()); + + IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer()); + //writer.setMaxBufferedDocs(2); + writer.setMaxBufferedDocs(10); + ((ConcurrentMergeScheduler) writer.getMergeScheduler()).setSuppressExceptions(); + + Document doc = new Document(); + doc.add(new Field("content", "aaa", Field.Store.YES, Field.Index.TOKENIZED)); + doc.add(new Field("id", "0", Field.Store.YES, Field.Index.TOKENIZED)); + for(int i=0;i<157;i++) + writer.addDocument(doc); + + return writer; + } + + private void crash(final IndexWriter writer) throws IOException { + final MockRAMDirectory dir = (MockRAMDirectory) writer.getDirectory(); + ConcurrentMergeScheduler cms = (ConcurrentMergeScheduler) writer.getMergeScheduler(); + dir.crash(); + cms.sync(); + dir.clearCrash(); + } + + public void testCrashWhileIndexing() throws IOException { + IndexWriter writer = initIndex(); + MockRAMDirectory dir = (MockRAMDirectory) writer.getDirectory(); + crash(writer); + IndexReader reader = IndexReader.open(dir); + assertTrue(reader.numDocs() < 157); + } + + public void testWriterAfterCrash() throws IOException { + IndexWriter writer = initIndex(); + MockRAMDirectory dir = (MockRAMDirectory) writer.getDirectory(); + dir.setPreventDoubleWrite(false); + crash(writer); + writer = initIndex(dir); + writer.close(); + + IndexReader reader = IndexReader.open(dir); + assertTrue(reader.numDocs() < 314); + } + + public void testCrashAfterReopen() throws IOException { + IndexWriter writer = initIndex(); + MockRAMDirectory dir = (MockRAMDirectory) writer.getDirectory(); + writer.close(); + writer = initIndex(dir); + assertEquals(314, writer.docCount()); + crash(writer); + + /* + System.out.println("\n\nTEST: open reader"); + String[] l = dir.list(); + Arrays.sort(l); + for(int i=0;i= 157); + } + + public void testCrashAfterClose() throws IOException { + + IndexWriter writer = initIndex(); + MockRAMDirectory dir = (MockRAMDirectory) writer.getDirectory(); + + writer.close(); + dir.crash(); + + /* + String[] l = dir.list(); + Arrays.sort(l); + for(int i=0;i