lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tyler V" <tyler...@gmail.com>
Subject Re: Corrupted Indexes Under Lucene 2.3 (and 2.3.1)
Date Sat, 01 Mar 2008 00:05:00 GMT
Mike -- Thanks so much for the prompt reply.

You are right, we are accessing these documents with multiple threads
(and have always been). However, I am wondering if the increased
indexing speed in 2.3 has revealed a hidden concurrency issue.

I am going to add in some additional concurrency checks, if I have
questions I will post another question to the list.

Thanks again for the reply.

Tyler

On Fri, Feb 29, 2008 at 2:46 PM, Michael McCandless
<lucene@mikemccandless.com> wrote:
>
>  Not good!  (I'm sorry).
>
>  That first exception is worrisome.  It's the root cause here.
>
>  Can you describe your documents?  That exception, if I'm reading it
>  right, seems to imply that you have documents with 4762 fields.  Is
>  that right?
>
>  Are you using multiple threads?  Is it possible that you cal
>  addDocument with a given document, but another thread removing fields
>  from that document (or, removing elements from the List returned by
>  Document.getFields())?  That's the only way I can explain that first
>  exception.
>
>  Mike
>
>
>
>  Tyler V wrote:
>
>  > After upgrading to Lucene 2.3 (and subsequently 2.3.1), our
>  > application has experienced sporadic index corruptions on our larger
>  > (and more frequently updated) indexes. These indexes experienced no
>  > corruptions under any prior version of Lucene (which we have been
>  > using for several years).
>  >
>  > The pattern of failure seems to be consistent.  First, we receive an
>  > exception like the following:
>  >
>  > java.lang.IndexOutOfBoundsException: Index: 4788, Size: 4762
>  >         at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>  >         at java.util.ArrayList.get(ArrayList.java:322)
>  >         at org.apache.lucene.index.DocumentsWriter$ThreadState.init
>  > (DocumentsWriter.java:749)
>  >         at org.apache.lucene.index.DocumentsWriter.getThreadState
>  > (DocumentsWriter.java:2391)
>  >         at org.apache.lucene.index.DocumentsWriter.updateDocument
>  > (DocumentsWriter.java:2434)
>  >         at org.apache.lucene.index.DocumentsWriter.addDocument
>  > (DocumentsWriter.java:2422)
>  >         at org.apache.lucene.index.IndexWriter.addDocument
>  > (IndexWriter.java:1445)
>  >         at org.apache.lucene.index.IndexWriter.addDocument
>  > (IndexWriter.java:1424)
>  >         at com.myapp.indexing.IndexerRunner.run(IndexerRunner.java:
>  > 134)
>  >         at java.lang.Thread.run(Thread.java:619)
>  >
>  > When we experience this error, we run a writer.flush() then a
>  > writer.close().
>  >
>  > Then, we get this exception when trying to re-open the index:
>  >
>  > org.apache.lucene.index.CorruptIndexException: doc counts differ for
>  > segment _c2z13: fieldsReader shows 2 but segmentInfo shows 3
>  >         at org.apache.lucene.index.SegmentReader.initialize
>  > (SegmentReader.java:313)
>  >         at org.apache.lucene.index.SegmentReader.get
>  > (SegmentReader.java:262)
>  >         at org.apache.lucene.index.SegmentReader.get
>  > (SegmentReader.java:197)
>  >         at org.apache.lucene.index.MultiSegmentReader.<init>
>  > (MultiSegmentReader.java:55)
>  >         at org.apache.lucene.index.DirectoryIndexReader$1.doBody
>  > (DirectoryIndexReader.java:75)
>  >         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run
>  > (SegmentInfos.java:636)
>  >         at org.apache.lucene.index.DirectoryIndexReader.open
>  > (DirectoryIndexReader.java:63)
>  >         at org.apache.lucene.index.IndexReader.open
>  > (IndexReader.java:209)
>  >         at org.apache.lucene.index.IndexReader.open
>  > (IndexReader.java:192)
>  >         at com.myapp.indexing.IndexerRunner.run(IndexerRunner.java:
>  > 107)
>  >         at java.lang.Thread.run(Thread.java:619)
>  >
>  > Running the check index application included with 2.3 enables us to
>  > remove the bad documents from the index, but this workaround is less
>  > than desirable.  It would be greatly appreciated if anyone could shed
>  > some light on our issue.
>  >
>  > Regards,
>  >
>  > Tyler
>  >
>  > ---------------------------------------------------------------------
>  > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>  > For additional commands, e-mail: java-user-help@lucene.apache.org
>  >
>
>
>  ---------------------------------------------------------------------
>  To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>  For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message