lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cristian Vat <cristian....@gmail.com>
Subject Re: problem during index merge
Date Wed, 20 Oct 2010 20:19:28 GMT
Unfortunately my target index is fully optimized so only one segment.
So CheckIndex won't be useful.

I'll reindex from scratch and also check for hardware issues.
Hopefully it won't get corrupted again.

Thanks for your help.

-
Cristian Vat

On Wed, Oct 20, 2010 at 22:58, Michael McCandless
<lucene@mikemccandless.com> wrote:
> Not recoverable automatically.
>
> But you can run CheckIndex -fix -- it removes the segment(s) that has
> the corruption (losing all docs in that segment).
>
> Searches appear fine because we don't do this check ("docs out of
> order") during searching, but results are likely wrong when they hit
> that segment(s).
>
> Hardware issues can definitely cause this -- eg if RAM or IO system is
> flipping bits occasionally that can cause such corruption.
>
> But if it's repeatable on different machines then there's a bug here ;)
>
> Mike
>
> On Wed, Oct 20, 2010 at 3:37 PM, Cristian Vat <cristian.vat@gmail.com> wrote:
>> Corruption in the sense that it isn't recoverable or is there
>> something I might be able to do?
>>
>> The index opens up without problems in Luke and there's an application
>> using it for searching without problems (apparently)
>>
>> It's actually an application with a (possibly) hourly update, so
>> finding the last good index and when something might have broken won't
>> be possible.
>>
>> Could it be caused by a hardware problem? It's possible there were
>> some problems on the same server before. But I imagined it would be
>> completely broken (even for search) if it was a hardware issue.
>>
>> -
>> Cristian Vat
>>
>> On Wed, Oct 20, 2010 at 22:22, Michael McCandless
>> <lucene@mikemccandless.com> wrote:
>>> This looks like index corruption.
>>>
>>> But: are you able to reproduce this corruption, eg on different
>>> machines?  I'd love to see how :)
>>>
>>> Your usage (using addIndexesNoOptimize to add indices) looks fine.
>>>
>>> Mike
>>>
>>> On Wed, Oct 20, 2010 at 2:45 PM, Cristian Vat <cristian.vat@gmail.com>
wrote:
>>>> Hello,
>>>>
>>>> I've been running into a problem during a merge. Would appreciate
>>>> knowing what to look for since the exception doesn't seem too
>>>> explanatory.
>>>>
>>>> I get:
>>>> --
>>>> --- Nested Exception ---
>>>> java.io.IOException: background merge hit exception: _2p:c695204
>>>> _2q:c93106 into _2r [optimize] [mergeDocStores]
>>>>        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2908)
>>>>        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2843)
>>>>        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2813)
>>>>        ...application code...
>>>> Caused by: org.apache.lucene.index.CorruptIndexException: docs out of
>>>> order (64570 <= 64570 )
>>>>        at org.apache.lucene.index.FormatPostingsDocsWriter.addDoc(FormatPostingsDocsWriter.java:75)
>>>>        at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:703)
>>>>        at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:648)
>>>>        at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:586)
>>>>        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:154)
>>>>        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5112)
>>>>        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4675)
>>>>        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
>>>>        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
>>>> --
>>>>
>>>> I have two indexes, first I delete some documents from the second one
>>>> and then use IndexWriter.addIndexesNoOptimize() and
>>>> IndexWriter.optimize() to add all of the first index to the second
>>>> one.
>>>> Everything works fine if I don't call optimize()
>>>> Using lucene 2.9.3. I get the same problem on windows and linux with
>>>> different java versions.
>>>>
>>>> Also, the same code ran without problems for some months, but today
>>>> with the indexes I have it consistently fails.
>>>>
>>>> I opened the target index with Luke and ran Check Index and I do get
>>>> an error but I don't know exactly what causes it:
>>>> --
>>>>  test: terms, freq, prox...ERROR [term fulltext:instructions: doc
>>>> 260628: pos -405712460 is out of bounds]
>>>> java.lang.RuntimeException: term fulltext:instructions: doc 260628:
>>>> pos -405712460 is out of bounds
>>>>        at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:657)
>>>>        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>>>>        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319)
>>>>        at org.getopt.luke.Luke$6.run(Unknown Source)
>>>> --
>>>>
>>>> Thanks for any help.
>>>> -
>>>> Cristian Vat
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message