lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Corrupt index (IndexOutOfBoundsException)
Date Mon, 23 Mar 2009 12:43:14 GMT

Something appears to be wrong with your _X.tii file (inside the  
compound file).

Can you post the code that recreates this broken index?

Since it appears to be repeatable, could you regenerate your index  
with compound file off, confirm the problem still happens, and then  
post the _X.tii file?  I can try to look at it.

Mike

René Zöpnek wrote:

> Hello,
>
> I'm using Lucene 2.3.2 and had no problems untill now.
>
> But now I got an corrupt index. When searching, a  
> java.lang.OutOfMemoryError is thrown. I've wrote the following test  
> program:
>
> private static void search(String index, String query) throws  
> CorruptIndexException, IOException, ParseException
> {
> 	IndexReader reader = IndexReader.open(index);
> 	//reader.setTermInfosIndexDivisor(10);
> 	Collection col =  
> Reader.getFieldNames(IndexReader.FieldOption.INDEXED);
> 	Iterator it = col.iterator();
> 	String[] fields = new String[col.size()];
> 	int i = 0;
> 	while(it.hasNext())
> 	{
> 		fields[i] = (String)it.next();
> 		System.out.println("field["+i+"]: "+fields[i]);
> 		i++;			
> 	}
> 	Analyzer analyzer = new StandardAnalyzer();
> 	MultiFieldQueryParser parser = new MultiFieldQueryParser(fields,  
> analyzer);
> 	parser.setAllowLeadingWildcard(true);
> 	Query quer = parser.parse(query);
> 	System.out.println("Query: "+quer.toString());
> 	quer = quer.rewrite(reader);
> 	System.out.println("rewritten Query: "+quer.toString());
> 	reader.close();
> }
>
> If reader.setTermInfosIndexDivisor() is commented out, the  
> stacktrace looks like this:
>
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> 	at  
> org 
> .apache 
> .lucene.index.TermInfosReader.ensureIndexIsRead(TermInfosReader.java: 
> 155)
> 	at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java: 
> 202)
> 	at  
> org.apache.lucene.index.TermInfosReader.terms(TermInfosReader.java: 
> 277)
> 	at org.apache.lucene.index.SegmentReader.terms(SegmentReader.java: 
> 643)
> 	at org.apache.lucene.search.PrefixQuery.rewrite(PrefixQuery.java:42)
> 	at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java: 
> 385)
> 	at diplom.lucene.Index.search(Index.java:59)
> 	at diplom.lucene.Index.main(Index.java:28)
>
> The index has a size of 59 MB, so it is weird to get an  
> OutOfMemoryException. So with reader.setTermInfosIndexDivisor() set  
> to 10, the stacktrace looks like:
>
> java.lang.IndexOutOfBoundsException: Index: 103, Size: 54
> 	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> 	at java.util.ArrayList.get(ArrayList.java:322)
> 	at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
> 	at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:249)
> 	at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:68)
> 	at  
> org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:123)
> 	at  
> org 
> .apache 
> .lucene.index.TermInfosReader.ensureIndexIsRead(TermInfosReader.java: 
> 159)
> 	at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java: 
> 202)
> 	at  
> org.apache.lucene.index.TermInfosReader.terms(TermInfosReader.java: 
> 277)
> 	at org.apache.lucene.index.SegmentReader.terms(SegmentReader.java: 
> 643)
> 	at org.apache.lucene.search.PrefixQuery.rewrite(PrefixQuery.java:42)
> 	at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java: 
> 385)
> 	at diplom.lucene.Index.search(Index.java:59)
> 	at diplom.lucene.Index.main(Index.java:28)
>
>
> CheckIndex prints the following:
>
> Segments file=segments_1zrx5 numSegments=1  
> version=FORMAT_SHARED_DOC_STORE [Lucene 2.3]
>  1 of 1: name=_5pa8 docCount=117378
>    compound=true
>    numFiles=1
>    size (MB)=57,573
>    no deletions
>    test: open reader.........OK
>    test: fields, norms.......OK [54 fields]
>    test: terms, freq, prox...FAILED
>    WARNING: would remove reference to this segment (-fix was not  
> specified); full exception:
> java.lang.IndexOutOfBoundsException: Index: 110, Size: 54
> 	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> 	at java.util.ArrayList.get(ArrayList.java:322)
> 	at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
> 	at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:249)
> 	at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:68)
> 	at  
> org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:123)
> 	at org.apache.lucene.index.CheckIndex.check(CheckIndex.java:182)
> 	at diplom.lucene.Index.check(Index.java:67)
> 	at diplom.lucene.Index.main(Index.java:28)
>
> WARNING: 1 broken segments detected
> WARNING: 117378 documents would be lost if -fix were specified
>
> NOTE: would write new segments file [-fix was not specified]
>
> Index correct: false
>
>
>
> Recreating the index didn't solve the problem. And I have no idea  
> for solving it, so every help is greatly appreciated.
>
> Thanks in advance.
> Rene
> -- 
> Aufgepasst: Sind Ihre Daten beim Online-Banking auch optimal  
> geschützt?
> Jetzt absichern: https://homebanking.gmx.net/?mc=mail@footer.hb
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message