lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: CorruptIndexException with some versions of java
Date Tue, 18 Mar 2008 14:03:54 GMT

I don't see an attachment here -- maybe the mailing list software
stripped it off.  If so, can you send it directly to me?  Thanks.

Mike

Ian Lea wrote:

> Documents are biblio records.  All have title, author etc. stored,
> some have a few extra fields as well.  Typically around 25 fields per
> doc.  The index is created with compound format, everything else as
> default.
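
A minimal sketch of that kind of setup against the Lucene 2.3 API (the
class name, field names, analyzer and index path below are illustrative
assumptions, not the actual loader code):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class BiblioIndexSketch {
        public static void main(String[] args) throws Exception {
            // Create a new index; compound file format is the 2.3 default,
            // set explicitly here only for clarity.
            Directory dir = FSDirectory.getDirectory("/path/to/index");
            IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
            writer.setUseCompoundFile(true);

            // One biblio record per document, around 25 fields in practice;
            // only title and author are shown here.
            Document doc = new Document();
            doc.add(new Field("title", "An example title",
                              Field.Store.YES, Field.Index.TOKENIZED));
            doc.add(new Field("author", "An example author",
                              Field.Store.YES, Field.Index.TOKENIZED));
            writer.addDocument(doc);

            writer.close();
        }
    }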
>
> I've rerun the job until failure.  Different numbers this time, but
> basically the same exception. In the failing pass it was trying to
> load 100K docs.
>
> Exception in thread "Thread-1"
> org.apache.lucene.index.MergePolicy$MergeException:
> org.apache.lucene.index.CorruptIndexException: doc counts differ for
> segment _cb: fieldsReader shows 71663 but segmentInfo shows 71664
>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
> Caused by: org.apache.lucene.index.CorruptIndexException: doc counts
> differ for segment _cb: fieldsReader shows 71663 but segmentInfo shows 71664
>         at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
>         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3099)
>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
>
> As mentioned, the index is loaded in chunks - the infostream from the
> failing pass is attached. All infostreams from the 19-odd runs leading
> up to the failure are available as well if that would help.
>
>
> Running with -ea doesn't seem to have made any difference.
>
>
> --
> Ian.
>
>
> On Tue, Mar 18, 2008 at 12:09 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>>
>>  Can you call IndexWriter.setInfoStream(...) and get the error to
>>  happen and post back the resulting output?  And, turn on assertions
>>  (java -ea) since that may catch the issue sooner.
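
Concretely, those two suggestions amount to something like the sketch
below (the class name, log file name and index path are assumptions,
not from the actual loader):

    import java.io.FileOutputStream;
    import java.io.PrintStream;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    public class InfoStreamSketch {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter(
                    FSDirectory.getDirectory("/path/to/index"),
                    new StandardAnalyzer(), true);

            // Capture IndexWriter's diagnostics (flushes, merges, segment
            // bookkeeping) in a file that can be posted back to the list.
            writer.setInfoStream(new PrintStream(new FileOutputStream("infostream.log")));

            // ... add documents as usual ...

            writer.close();
        }
    }

followed by launching the loader with assertions enabled, e.g.
"java -ea -cp lucene-core-2.3.1.jar:. InfoStreamSketch".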
>>
>>  Can you describe how you are setting up IndexWriter (autoCommit,
>>  compound, etc.), and what your documents are like?  Do your documents
>>  have a fixed schema (same fields every time), or does it vary such
>>  that some documents have no stored fields and some do?
>>
>>  Mike
>>
>>
>>
>>  Ian Lea wrote:
>>
>>> Hi
>>>
>>>
>>> When bulk loading into a new index I'm seeing this exception
>>>
>>> Exception in thread "Thread-1"
>>> org.apache.lucene.index.MergePolicy$MergeException:
>>> org.apache.lucene.index.CorruptIndexException: doc counts differ for
>>> segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
>>>       at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
>>> Caused by: org.apache.lucene.index.CorruptIndexException: doc counts
>>> differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
>>>       at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
>>>       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>>>       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
>>>       at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3093)
>>>       at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
>>>       at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
>>>
>>> when using java version 1.6.0_05-b13 or 1.6.0_04-b12 on linux, with
>>> lucene 2.3.0, 2.3.1, or lucene-core-2.3-SNAPSHOT from yesterday.
>>>
>>> With java version 1.6.0_03-b05 things work fine.
>>>
>>> The exception happens a few hundred thousand documents into the  
>>> load.
>>>
>>> A different program updating a different index with different  
>>> data on
>>> a different server gave a similar error on version 1.6.0_05-b13 and
>>> lucene 2.3.0.
>>>
>>>
>>> Any ideas?  Is this maybe a known issue or am I missing something
>>> obvious?
>>>
>>>
>>>
>>> --
>>> Ian.
>>>
>>
>>
>>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

