lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "René Zöpnek" <Zoe...@gmx.de>
Subject Corrupt index (IndexOutOfBoundsException)
Date Mon, 23 Mar 2009 10:42:51 GMT
Hello,

I'm using Lucene 2.3.2 and had no problems untill now.

But now I got an corrupt index. When searching, a java.lang.OutOfMemoryError is thrown. I've
wrote the following test program:

private static void search(String index, String query) throws CorruptIndexException, IOException,
ParseException
{
	IndexReader reader = IndexReader.open(index);
	//reader.setTermInfosIndexDivisor(10);
	Collection col = Reader.getFieldNames(IndexReader.FieldOption.INDEXED);
	Iterator it = col.iterator();
	String[] fields = new String[col.size()];
	int i = 0;
	while(it.hasNext())
	{
		fields[i] = (String)it.next();
		System.out.println("field["+i+"]: "+fields[i]);
		i++;			
	}
	Analyzer analyzer = new StandardAnalyzer();
	MultiFieldQueryParser parser = new MultiFieldQueryParser(fields, analyzer);
	parser.setAllowLeadingWildcard(true);
	Query quer = parser.parse(query);
	System.out.println("Query: "+quer.toString());
	quer = quer.rewrite(reader);
	System.out.println("rewritten Query: "+quer.toString());
	reader.close();
}

If reader.setTermInfosIndexDivisor() is commented out, the stacktrace looks like this:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at org.apache.lucene.index.TermInfosReader.ensureIndexIsRead(TermInfosReader.java:155)
	at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:202)
	at org.apache.lucene.index.TermInfosReader.terms(TermInfosReader.java:277)
	at org.apache.lucene.index.SegmentReader.terms(SegmentReader.java:643)
	at org.apache.lucene.search.PrefixQuery.rewrite(PrefixQuery.java:42)
	at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:385)
	at diplom.lucene.Index.search(Index.java:59)
	at diplom.lucene.Index.main(Index.java:28)

The index has a size of 59 MB, so it is weird to get an OutOfMemoryException. So with reader.setTermInfosIndexDivisor()
set to 10, the stacktrace looks like:

java.lang.IndexOutOfBoundsException: Index: 103, Size: 54
	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
	at java.util.ArrayList.get(ArrayList.java:322)
	at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
	at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:249)
	at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:68)
	at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:123)
	at org.apache.lucene.index.TermInfosReader.ensureIndexIsRead(TermInfosReader.java:159)
	at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:202)
	at org.apache.lucene.index.TermInfosReader.terms(TermInfosReader.java:277)
	at org.apache.lucene.index.SegmentReader.terms(SegmentReader.java:643)
	at org.apache.lucene.search.PrefixQuery.rewrite(PrefixQuery.java:42)
	at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:385)
	at diplom.lucene.Index.search(Index.java:59)
	at diplom.lucene.Index.main(Index.java:28)


CheckIndex prints the following:

Segments file=segments_1zrx5 numSegments=1 version=FORMAT_SHARED_DOC_STORE [Lucene 2.3]
  1 of 1: name=_5pa8 docCount=117378
    compound=true
    numFiles=1
    size (MB)=57,573
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [54 fields]
    test: terms, freq, prox...FAILED
    WARNING: would remove reference to this segment (-fix was not specified); full exception:
java.lang.IndexOutOfBoundsException: Index: 110, Size: 54
	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
	at java.util.ArrayList.get(ArrayList.java:322)
	at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
	at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:249)
	at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:68)
	at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:123)
	at org.apache.lucene.index.CheckIndex.check(CheckIndex.java:182)
	at diplom.lucene.Index.check(Index.java:67)
	at diplom.lucene.Index.main(Index.java:28)

WARNING: 1 broken segments detected
WARNING: 117378 documents would be lost if -fix were specified

NOTE: would write new segments file [-fix was not specified]

Index correct: false



Recreating the index didn't solve the problem. And I have no idea for solving it, so every
help is greatly appreciated.

Thanks in advance.
Rene
-- 
Aufgepasst: Sind Ihre Daten beim Online-Banking auch optimal geschützt?
Jetzt absichern: https://homebanking.gmx.net/?mc=mail@footer.hb

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message