lucene-java-user mailing list archives

From "袁武 [GMail]" <yuanwu.m...@gmail.com>
Subject Re: Re: A likely bug of TermsPosition.nextPosition
Date Fri, 01 Apr 2011 10:27:00 GMT
Hi Mike,

Below is the CheckIndex report. The OS is Fedora Linux.

[oracle@server bin]$ java -classpath ./ org.apache.lucene.index.CheckIndex /data/Index/URL/Generic/ -fix

NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene...', so assertions are enabled
Opening index @ /GeoGrid/data/Index/URL/Generic/
Segments file=segments_ayup numSegments=1 version=FORMAT_DIAGNOSTICS [Lucene 2.9]
  1 of 1: name=_bym0 docCount=1344278
    compound=false
    hasProx=true
    numFiles=11
    size (MB)=15,979.593
    diagnostics = {optimize=true, mergeFactor=3, os.version=2.6.27.41-170.2.117.fc10.x86_64, os=Linux, mergeDocStores=true, lucene.version=3.0-dev, source=merge, os.arch=amd64, java.version=1.6.0_21, java.vendor=Sun Microsystems Inc.}
    no deletions
    test: open reader.........OK
    test: fields..............OK [11 fields]
    test: field norms.........OK [2 fields]
    test: terms, freq, prox...OK [1263500 terms; 601715072 terms/docs pairs; 1513780631 tokens]
    test: stored fields.......OK [8063151 total field count; avg 5.998 fields per doc]
    test: term vectors........OK [1344278 total vector count; avg 1 term/freq vector fields per doc]
No problems were detected with this index.

2011-04-01
袁武 [GMail]

From: Michael McCandless
Sent: 2011-04-01 17:58:08
To: java-user
Cc: 袁武 [GMail]
Subject: Re: A likely bug of TermsPosition.nextPosition
 
Hmm, this could be from a corrupted index.
What version of Lucene?  What OS/filesystem?
Can you run CheckIndex and post the output?
Mike
http://blog.mikemccandless.com
2011/3/31 袁武 [GMail] <yuanwu.mail@gmail.com>:
> Hi, dear experts:
>
> When IndexReader.termPositions is used to access specific terms, the call to
> TermPositions.nextPosition succeeds if TermPositions.next is used to advance.
> But if TermPositions.skipTo is used instead of TermPositions.next, a
> java.lang.IllegalArgumentException is thrown, as shown below.
>
>  java.lang.IllegalArgumentException: Negative position
>    at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:610)
>    at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
>    at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:213)
>    at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
>    at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:92)
>    at org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:181)
>    at org.apache.lucene.index.SegmentTermPositions.readDeltaPosition(SegmentTermPositions.java:75)
>    at org.apache.lucene.index.SegmentTermPositions.skipPositions(SegmentTermPositions.java:130)
>    at org.apache.lucene.index.SegmentTermPositions.lazySkip(SegmentTermPositions.java:168)
>    at org.apache.lucene.index.SegmentTermPositions.nextPosition(SegmentTermPositions.java:69)
>
> On further investigation, I found that if the docid exceeds 1044278, the
> exception occurs every time; for smaller docids, it never occurs. BTW, the
> total number of documents is about 1344278, and none have ever been deleted.
>
> Thanks.
>
> 2011-04-01
>
>
>
> Yuan Wu [GMail]
>
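
For reference, the "Negative position" at the top of the quoted trace comes from the JDK itself: FileChannel.read(ByteBuffer, long) rejects a negative file offset with an IllegalArgumentException. That suggests, though it does not prove, that the offset handed down from SegmentTermPositions after skipTo was negative. Below is a minimal, Lucene-free sketch of that JDK behaviour only; the class name, method name, and temp file are illustrative and are not taken from the original report.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NegativePositionDemo {

    // Returns true if a positional read at the given offset is rejected with
    // IllegalArgumentException, false if the read call returns normally.
    public static boolean rejectsPosition(long position) throws IOException {
        Path tmp = Files.createTempFile("negpos", ".bin");
        try {
            // A tiny 4-byte file is enough; only the offset check matters here.
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                ch.read(ByteBuffer.allocate(1), position);
                return false;
            } catch (IllegalArgumentException e) {
                // The same exception type as the first frame of the trace.
                return true;
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("offset  0 rejected: " + rejectsPosition(0L));
        System.out.println("offset -1 rejected: " + rejectsPosition(-1L));
    }
}
```

This only narrows down where the exception is raised. It says nothing about why the offset would be negative for docids above 1044278 and not for smaller ones.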