lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: index corruption with latest lucene
Date Mon, 05 May 2008 21:32:32 GMT

Also, if you can run your tests with assertions enabled, it could
catch something...
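
In case it is unclear whether assertions are really on for a given test
run, here is a minimal sketch of the usual check. AssertionCheck is a
hypothetical class name used only for illustration; it is not part of
Lucene. Assertions are off by default and are switched on with the JVM's
-ea flag (for example -ea:org.apache.lucene... to limit them to Lucene's
packages).

```java
public class AssertionCheck {
    /**
     * Returns true only when assertions are enabled for this class.
     * The assignment inside the assert statement is evaluated only
     * when the JVM runs with -ea (or -enableassertions).
     */
    public static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // intentional side effect, skipped when assertions are off
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Running this once with and once without -ea shows which mode the test
harness is actually using before you spend time reproducing the problem.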

Mike

Mark Miller wrote:
> On Mon, 2008-05-05 at 16:32 -0400, Michael McCandless wrote:
>> Hi Mark,
>>
>> Not good!
>>
>> Can you describe how this index was created?  Did you use multiple
>> threads on one IndexWriter?  Multiple sessions of IndexWriter
>> appending to the index?  addIndexes*?  Is the index copied from one
>> place to another after being written and before being searched?
>
> Both sites were created by a single thread on a single IndexWriter.
> Updates are done through multiple threads and one IndexWriter. No
> addIndexes. Index was never copied, always same path.
>
>>
>> If you run CheckIndex, what does it report?
>
> This was my next move...unfortunately, someone accidentally kicked off a
> complete reindex before I could do it. From what I can tell by the stack
> trace, it's a per-doc problem...I am guessing I could have printed the
> ids of the problem docs and just reindexed those? I have to deal with this
> at many other sites, so that may be my attack...I cannot reindex
> everything to fix.
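
The "print the ids of the problem docs" idea above could be sketched
roughly as follows. This is only an illustration under stated
assumptions: BadDocScanner and DocLoader are hypothetical names, and
DocLoader stands in for whatever call loads a stored document by id
(for instance a wrapper around IndexReader.document(int)). Deleted
documents would probably need to be skipped before loading, which this
sketch does not do.

```java
import java.util.ArrayList;
import java.util.List;

public class BadDocScanner {
    /** Hypothetical stand-in for the call that loads a stored document by id. */
    public interface DocLoader {
        Object load(int docId) throws Exception;
    }

    /**
     * Tries to load every doc id in [0, maxDoc) and returns the ids
     * for which the load threw an exception.
     */
    public static List<Integer> findUnreadable(int maxDoc, DocLoader loader) {
        List<Integer> bad = new ArrayList<>();
        for (int docId = 0; docId < maxDoc; docId++) {
            try {
                loader.load(docId);
            } catch (Exception e) {
                // Covers runtime failures such as IndexOutOfBoundsException.
                bad.add(docId);
            }
        }
        return bad;
    }
}
```

The returned ids are the candidates for a targeted reindex. Whether
replacing only those documents leaves the index healthy is a separate
question that CheckIndex would have to answer.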
>
>>
>> Any prior exceptions on this index?
>
> Not that I can recall. One of the indexes was made months ago, probably with
> a 2.0 or 2.1 Lucene, the second was made with a post-2.2 Lucene. One
> site was Windows 2003, the other AIX. One site was only 30,000 docs, the
> other over 1 million.
>
>>
>> Are your docs a variable schema (different fields)?
>
> Yes. Lots of different fields depending on the doc.
>
>>
>> Mike
>
> Thanks Mike. I am currently trying to duplicate this. I can't go to
> another site without testing some kind of fix.
>
>>
>> Mark Miller wrote:
>>> Yeah, it's pretty close to 2.3.2, but I think from last week maybe.
>>>
>>> I finally have one of the stack traces (this comes on the tail of a
>>> complete laptop failure so I am scrambling here)
>>>
>>> java.lang.IndexOutOfBoundsException: Index: 97, Size: 43
>>>         at java.util.ArrayList.RangeCheck(ArrayList.java:572)
>>>         at java.util.ArrayList.get(ArrayList.java:347)
>>>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
>>>         at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:184)
>>>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:670)
>>>         at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>>>         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:97)
>>>
>>> On Mon, 2008-05-05 at 14:48 -0500, crspan wrote:
>>>> Coincidence, or is it from 2.3.2?
>>>>
>>>> env:
>>>> lucene 2.3.2
>>>> jdk1.6.0_06 & jdk1.5.0_15
>>>>
>>>>
>>>> QueryString:
>>>> illeg^30.820824 technolog^22.290413 transfer^33.307804
>>>> Error: java.lang.ArrayIndexOutOfBoundsException: 132704
>>>> java.lang.ArrayIndexOutOfBoundsException: 132704
>>>> at org.apache.lucene.search.BooleanScorer2$Coordinator.coordFactor(BooleanScorer2.java:55)
>>>> at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:358)
>>>> at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:320)
>>>> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:146)
>>>> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:113)
>>>> at org.apache.lucene.search.Searcher.search(Searcher.java:132)
>>>> at org.cr.search.TrecQueryRelevanceFeedback.main(TrecQueryRelevanceFeedback.java:776)
>>>>
>>>>
>>>> QueryString:
>>>> oceanograph^68.48028 vessel^43.191563
>>>> Error: java.lang.ArrayIndexOutOfBoundsException
>>>> java.lang.ArrayIndexOutOfBoundsException
>>>> at java.lang.System.arraycopy(Native Method)
>>>> at org.apache.lucene.index.TermVectorsReader.readTermVector(TermVectorsReader.java:353)
>>>> at org.apache.lucene.index.TermVectorsReader.readTermVectors(TermVectorsReader.java:287)
>>>> at org.apache.lucene.index.TermVectorsReader.get(TermVectorsReader.java:232)
>>>> at org.apache.lucene.index.SegmentReader.getTermFreqVectors(SegmentReader.java:981)
>>>> at org.cr.rf.RelevanceFeedback.RelFeedbackWeight(RelevanceFeedback.java:134)
>>>> at org.cr.search.TrecQueryRelevanceFeedback.main(TrecQueryRelevanceFeedback.java:781)
>>>>
>>>>
>>>>
>>>>
>>>> Mark Miller wrote:
>>>>> Any recent changes that would expose index corruption?
>>>>>
>>>>> I am getting two new errors when trying to search:
>>>>>
>>>>> nullpointer fieldsreaders line 260
>>>>>
>>>>> indexoutofbounds on fieldinfo line 185
>>>>>
>>>>> I am kind of screwed, because reindexing fixes this, but I can't
>>>>> reindex!
>>>>>
>>>>> Any ideas?
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>>>>

