lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Read past EOF
Date Tue, 28 Apr 2009 14:15:16 GMT
Ugh, indeed FieldInfos fails to properly read 2.3.x indices if the
field name contains non-ascii characters.  I'll open an issue, make a
test case and work out a fix.  Hmm.

Thanks for raising this!

Mike

On Tue, Apr 28, 2009 at 7:53 AM, Mike Streeton
<mike.streeton@connexica.com> wrote:
> I have an index that works fine on Lucene 2.3.2 but fails to open in 2.4.1, it always
fails with an Read past EOF. The index does contain some field names with german umlaut characters
in
>
> Any ideas?
>
> Many Thanks
>
> Mike
>
> CheckIndex v2.3.2
>
>
> NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene', so
assertions are enabled
>
> Opening index @ C:/index/german
>
> Segments file=segments_9 numSegments=1 version=FORMAT_SHARED_DOC_STORE [Lucene 2.3]
>  1 of 1: name=_3 docCount=235535
>    compound=true
>    numFiles=1
>    size (MB)=301.684
>    no deletions
>    test: open reader.........OK
>    test: fields, norms.......OK [70 fields]
>    test: terms, freq, prox...OK [1475862 terms; 25448796 terms/docs pairs; 28642994
tokens]
>    test: stored fields.......OK [13560464 total field count; avg 57.573 fields per
doc]
>    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields
per doc]
>
> No problems were detected with this index.
>
> CheckIndex v2.4.1
>
>
> NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene...',
so assertions are enabled
>
> Opening index @ C:/index/german
>
> Segments file=segments_9 numSegments=1 version=FORMAT_SHARED_DOC_STORE [Lucene 2.3]
>  1 of 1: name=_3 docCount=235535
>    compound=true
>    hasProx=true
>    numFiles=1
>    size (MB)=301.684
>    no deletions
>    test: open reader.........FAILED
>    WARNING: fixIndex() would remove reference to this segment; full exception:
> java.io.IOException: read past EOF
>      at org.apache.lucene.store.BufferedIndexInput.refill(Unknown Source)
>      at org.apache.lucene.store.BufferedIndexInput.readBytes(Unknown Source)
>      at org.apache.lucene.store.BufferedIndexInput.readBytes(Unknown Source)
>      at org.apache.lucene.store.IndexInput.readString(Unknown Source)
>      at org.apache.lucene.index.FieldInfos.read(Unknown Source)
>      at org.apache.lucene.index.FieldInfos.<init>(Unknown Source)
>      at org.apache.lucene.index.SegmentReader.initialize(Unknown Source)
>      at org.apache.lucene.index.SegmentReader.get(Unknown Source)
>      at org.apache.lucene.index.SegmentReader.get(Unknown Source)
>      at org.apache.lucene.index.CheckIndex.checkIndex(Unknown Source)
>      at org.apache.lucene.index.CheckIndex.main(Unknown Source)
>
> WARNING: 1 broken segments (containing 235535 documents) detected
> WARNING: would write new segments file, and 235535 documents would be lost, if -fix were
specified
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message