lucene-java-user mailing list archives

From Michael Busch <busch...@gmail.com>
Subject Re: index corruption with latest lucene
Date Mon, 05 May 2008 21:32:36 GMT
If that is the case then I will go ahead and publish the 2.3.2 release? 
Have you seen this on 2.3.x, Mark?

-Michael

Michael McCandless wrote:
> 
> Actually that stack trace looks like it's from trunk, not from 
> 2.3.2(pre)?  OK, I think you said it's from "post 2.3 trunk".
> 
> Another question: is autoCommit false or true?
> 
> More responses below:
> 
> Mark Miller wrote:
>> On Mon, 2008-05-05 at 16:32 -0400, Michael McCandless wrote:
>>> Hi Mark,
>>>
>>> Not good!
>>>
>>> Can you describe how this index was created?  Did you use multiple
>>> threads on one IndexWriter?  Multiple sessions of IndexWriter
>>> appending to the index?  addIndexes*?  Is the index copied from one
>>> place to another after being written and before being searched?
>>
>> Both sites were created by a single thread on a single IndexWriter.
>> Updates are done through multiple threads and one IndexWriter. No
>> addIndexes. Index was never copied, always same path.
>>
>>>
>>> If you run CheckIndex, what does it report?
>>
>> This was my next move...unfortunately, someone accidentally kicked off a
>> complete reindex before I could do it. From what I can tell by the stack
>> trace, it's a per-doc problem...I am guessing I could have printed the
>> ids of the problem docs and just reindex those? I have to deal with this
>> at many other sites, so that may be my attack...I cannot reindex
>> everything to fix.
> 
> It would be great to know if that workaround works (and indeed it's a 
> per-doc issue).  I'd also love to know how many docs are affected, when 
> you hit this.
> 
> If there's any way to zip up the index and send it to me, even just the 
> files for the one segment that has the corrupted doc, that'd be great.
> 
>>>
>>> Any prior exceptions on this index?
>>
>> Not that I can recall. One of the indexes was made months ago, prob with
>> a 2.0 or 2.1 Lucene, the second was made with a post 2.2 Lucene. One
>> site was Windows 2003, the other AIX. One site was only 30,000 docs, the
>> other over 1 million.
>>
>>>
>>> Are your docs a variable schema (different fields)?
>>
>> Yes. Lots of different fields depending on the doc.
>>
>>>
>>> Mike
>>
>> Thanks Mike. I am currently trying to duplicate this. I can't go to
>> another site without testing some kind of fix.
>>
>>>
>>> Mark Miller wrote:
>>>> Yeah, it's pretty close to 2.3.2, but I think from last week maybe.
>>>>
>>>> I finally have one of the stack traces (this comes on the tail of a
>>>> complete laptop failure, so I am scrambling here)
>>>>
>>>> java.lang.IndexOutOfBoundsException: Index: 97, Size: 43
>>>>         at java.util.ArrayList.RangeCheck(ArrayList.java:572)
>>>>         at java.util.ArrayList.get(ArrayList.java:347)
>>>>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
>>>>         at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:184)
>>>>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:670)
>>>>         at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>>>>         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:97)
>>>>
>>>> On Mon, 2008-05-05 at 14:48 -0500, crspan wrote:
>>>>> Coincidence, or is it from 2.3.2?
>>>>>
>>>>> env:
>>>>> lucene 2.3.2
>>>>> jdk1.6.0_06 & jdk1.5.0_15
>>>>>
>>>>>
>>>>> QueryString:
>>>>> illeg^30.820824 technolog^22.290413 transfer^33.307804
>>>>> Error: java.lang.ArrayIndexOutOfBoundsException: 132704
>>>>> java.lang.ArrayIndexOutOfBoundsException: 132704
>>>>> at org.apache.lucene.search.BooleanScorer2$Coordinator.coordFactor(BooleanScorer2.java:55)
>>>>> at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:358)
>>>>> at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:320)
>>>>> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:146)
>>>>> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:113)
>>>>> at org.apache.lucene.search.Searcher.search(Searcher.java:132)
>>>>> at org.cr.search.TrecQueryRelevanceFeedback.main(TrecQueryRelevanceFeedback.java:776)
>>>>>
>>>>>
>>>>> QueryString:
>>>>> oceanograph^68.48028 vessel^43.191563
>>>>> Error: java.lang.ArrayIndexOutOfBoundsException
>>>>> java.lang.ArrayIndexOutOfBoundsException
>>>>> at java.lang.System.arraycopy(Native Method)
>>>>> at org.apache.lucene.index.TermVectorsReader.readTermVector(TermVectorsReader.java:353)
>>>>> at org.apache.lucene.index.TermVectorsReader.readTermVectors(TermVectorsReader.java:287)
>>>>> at org.apache.lucene.index.TermVectorsReader.get(TermVectorsReader.java:232)
>>>>> at org.apache.lucene.index.SegmentReader.getTermFreqVectors(SegmentReader.java:981)
>>>>> at org.cr.rf.RelevanceFeedback.RelFeedbackWeight(RelevanceFeedback.java:134)
>>>>> at org.cr.search.TrecQueryRelevanceFeedback.main(TrecQueryRelevanceFeedback.java:781)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Mark Miller wrote:
>>>>>> Any recent changes that would expose index corruption?
>>>>>>
>>>>>> I am getting two new errors when trying to search:
>>>>>>
>>>>>> nullpointer fieldsreaders line 260
>>>>>>
>>>>>> indexoutofbounds on fieldinfo line 185
>>>>>>
>>>>>> I am kind of screwed, because reindexing fixes this, but I can't
>>>>>> reindex!
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>>
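
The per-document workaround raised in this thread (print the ids of the documents that fail to load, then reindex only those) might be sketched roughly as below. This is a minimal sketch under assumptions, not Lucene code: `BadDocScan`, `DocLoader`, and `findUnreadableDocs` are hypothetical names, and `DocLoader.load` merely stands in for whatever call loads a document's stored fields (the stack trace above goes through `IndexSearcher.doc`). The thread had not yet confirmed that the corruption really is per-document, and a real scan would also need to skip deleted documents.

```java
import java.util.ArrayList;
import java.util.List;

public class BadDocScan {

    /** Hypothetical stand-in for a stored-fields lookup; not a Lucene type. */
    interface DocLoader {
        Object load(int docId) throws Exception;
    }

    /**
     * Tries to load every doc id in [0, maxDoc) and returns the ids whose
     * load throws, so that only those documents need to be reindexed.
     */
    static List<Integer> findUnreadableDocs(DocLoader loader, int maxDoc) {
        List<Integer> bad = new ArrayList<>();
        for (int docId = 0; docId < maxDoc; docId++) {
            try {
                loader.load(docId);
            } catch (Exception e) {
                // Covers IndexOutOfBoundsException and NullPointerException,
                // the two failures reported in the thread.
                bad.add(docId);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Fake loader for illustration: docs 3 and 7 fail like the trace above.
        DocLoader fake = docId -> {
            if (docId == 3 || docId == 7) {
                throw new IndexOutOfBoundsException("Index: 97, Size: 43");
            }
            return "doc" + docId;
        };
        System.out.println(findUnreadableDocs(fake, 10)); // prints [3, 7]
    }
}
```

Collecting the ids first, rather than fixing documents inside the loop, keeps the scan read-only; it also gives a count of affected documents, which was one of the open questions in the thread.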