lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: CorruptIndexException with some versions of java
Date Tue, 18 Mar 2008 20:53:25 GMT

Ian, can you attach your version of SegmentMerger.java?  Somehow my
line numbers are off from yours.

Mike

Ian Lea wrote:
> Mike
>
>
> Latest patch produces similar exception:
>
> Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError: after mergeFields: fdx size mismatch: 65184 docs vs 521464 length in bytes of _c9.fdx
>         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:320)
>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:297)
> Caused by: java.lang.AssertionError: after mergeFields: fdx size mismatch: 65184 docs vs 521464 length in bytes of _c9.fdx
>         at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:347)
>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:133)
>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3852)
>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3504)
>         at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:211)
>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:266)
>
> Latest infostream attached.
>
>
> --
> Ian.
>
>
> On Tue, Mar 18, 2008 at 6:05 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>>
>>  Hi Ian,
>>
>>  Sheesh that's odd.  The SegmentMerger produced an .fdx file that is
>>  one document too short.
>>
>>  Can you run with this patch now, again applied to head of 2.3
>>  branch?  I just added another assert inside the loop that does the
>>  field merging.
>>
>>  I will scrutinize this code...
>>
>>  Mike
>>
>>
>>
>>
>>  Ian Lea wrote:
>>> Mike
>>>
>>>
>>> Patch applied and test re-run and picked up an assertion error this
>>> time:
>>>
>>> Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError: after mergeFields: fdx size mismatch: 72357 docs vs 578848 length in bytes of _3o.fdx
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:320)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:297)
>>> Caused by: java.lang.AssertionError: after mergeFields: fdx size mismatch: 72357 docs vs 578848 length in bytes of _3o.fdx
>>>         at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:342)
>>>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:133)
>>>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3852)
>>>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3504)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:211)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:266)
>>>
>>> The infostream output is attached.  Since this email is to you and the
>>> list it should make it to you.
>>>
>>>
>>>
>>> Yonik: I haven't been able to make TestStressIndexing2 fail.
>>>
>>>
>>> --
>>> Ian.
>>>
>>>
>>> On Tue, Mar 18, 2008 at 4:19 PM, Michael McCandless
>>> <lucene@mikemccandless.com> wrote:
>>>>
>>>>  Ian,
>>>>
>>>>  Could you apply the attached patch to the head of the 2.3
>>>>  branch?
>>>>
>>>>  It only adds more asserts, to try to pinpoint where exactly this
>>>>  corruption starts.
>>>>
>>>>  Then, re-run the test with asserts enabled and infoStream turned on
>>>>  and post back.  Thanks.
>>>>
>>>>  Mike
>>>>
>>>>
>>>>
>>>>
>>>>  Ian Lea wrote:
>>>>
>>>>> It's failed on servers running SuSE 10.0 and 8.2 (ancient!)
>>>>>
>>>>> $ uname -a shows
>>>>> Linux phoebe 2.6.13-15-smp #1 SMP Tue Sep 13 14:56:15 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux
>>>>>
>>>>> and
>>>>>
>>>>> Linux phobos 2.4.20-64GB-SMP #1 SMP Mon Mar 17 17:56:03 UTC 2003 i686 unknown unknown GNU/Linux
>>>>>
>>>>> The first one has a 2.8 GHz Intel CPU; I don't know about the second.
>>>>>
>>>>>
>>>>> I'll try and run the stress test.
>>>>>
>>>>>
>>>>> --
>>>>> Ian.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Mar 18, 2008 at 2:17 PM, Yonik Seeley <yonik@apache.org>
>>>>> wrote:
>>>>>>
>>>>>> On Tue, Mar 18, 2008 at 7:38 AM, Ian Lea <ian.lea@gmail.com> wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>>
>>>>>>>  When bulk loading into a new index I'm seeing this exception
>>>>>>>
>>>>>>>  Exception in thread "Thread-1" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
>>>>>>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
>>>>>>>  Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _4l: fieldsReader shows 67861 but segmentInfo shows 67862
>>>>>>>         at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
>>>>>>>         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>>>>>>>         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
>>>>>>>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3093)
>>>>>>>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
>>>>>>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
>>>>>>>
>>>>>>>  when I use Java version 1.6.0_05-b13 or 1.6.0_04-b12 on Linux, with
>>>>>>>  Lucene 2.3.0 or 2.3.1 or lucene-core-2.3-SNAPSHOT from yesterday.
>>>>>>>
>>>>>>>  With Java version 1.6.0_03-b05 things work fine.
>>>>>>>
>>>>>>>  The exception happens a few hundred thousand documents into the load.
>>>>>>>
>>>>>>>  A different program updating a different index with different data on
>>>>>>>  a different server gave a similar error on version 1.6.0_05-b13 and
>>>>>>>  Lucene 2.3.0.
>>>>>>>
>>>>>>>  Any ideas?  Is this maybe a known issue, or am I missing
>>>>>>>  something obvious?
>>>>>>
>>>>>>  My guess is perhaps a thread safety bug, more likely in Lucene
>>>>>>  indexing code (less likely in the JVM or specific libc).
>>>>>>
>>>>>>  What Linux version are you using?
>>>>>>  What hardware are you running on (specifically, the CPU)?
>>>>>>
>>>>>>  If possible, it would be great if you could check out Lucene trunk,
>>>>>>  crank up the iterations by modifying TestStressIndexing2 and maybe
>>>>>>  fiddle with some of the other parameters in
>>>>>>  TestStressIndexing2.testMultiConfig(), and see if you can get it to
>>>>>>  fail.
>>>>>>
>>>>>>
>>>>>>  -Yonik
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> <infostream.zip>
>>
>>
>>
>> <infostream.zip>
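
A note on the numbers in the two AssertionError messages quoted above: they are consistent with the "one document too short" reading if each document takes one 8-byte entry in the .fdx file and the file has no header. That layout is an assumption inferred from the reported figures, not something stated in this thread, and the class and method names below are hypothetical. A minimal sketch of the arithmetic:

```java
// Sketch only. Assumes one 8-byte entry per document in the .fdx file and
// no file header; this is inferred from the numbers in the thread.
final class FdxSizeCheck {
    static final long BYTES_PER_DOC = 8L; // assumed entry size

    /** Length in bytes the .fdx file should have for docCount documents. */
    static long expectedLength(int docCount) {
        return docCount * BYTES_PER_DOC;
    }

    /** How many documents' worth of entries are missing from the file. */
    static long missingDocs(int docCount, long actualLength) {
        return (expectedLength(docCount) - actualLength) / BYTES_PER_DOC;
    }

    public static void main(String[] args) {
        // Values copied from the two assertion messages in this thread.
        System.out.println(missingDocs(65184, 521464L)); // prints 1 (_c9.fdx)
        System.out.println(missingDocs(72357, 578848L)); // prints 1 (_3o.fdx)
    }
}
```

Under that assumption both failing merges are short by exactly one entry, which also matches the original CorruptIndexException (fieldsReader shows 67861 but segmentInfo shows 67862).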



