lucene-java-user mailing list archives

From "Brian Whitman" <br...@echonest.com>
Subject Re: background merge hit exception
Date Fri, 02 Jan 2009 21:02:39 GMT
Here's the CheckIndex output:

NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene', so assertions are enabled

Opening index @ /vol/solr/data/index/

Segments file=segments_vxx numSegments=8 version=FORMAT_HAS_PROX [Lucene 2.4]
  1 of 8: name=_ks4 docCount=2504982
    compound=false
    hasProx=true
    numFiles=11
    size (MB)=3,965.695
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [343 fields]
    test: terms, freq, prox...OK [37238560 terms; 161527224 terms/docs pairs; 186273362 tokens]
    test: stored fields.......OK [55813402 total field count; avg 22.281 fields per doc]
    test: term vectors........OK [7998458 total vector count; avg 3.193 term/freq vector fields per doc]

  2 of 8: name=_oaw docCount=514635
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=746.887
    has deletions [delFileName=_oaw_1rb.del]
    test: open reader.........OK [155528 deleted docs]
    test: fields, norms.......OK [172 fields]
    test: terms, freq, prox...OK [7396227 terms; 28146962 terms/docs pairs; 17298364 tokens]
    test: stored fields.......OK [5736012 total field count; avg 15.973 fields per doc]
    test: term vectors........OK [1045176 total vector count; avg 2.91 term/freq vector fields per doc]

  3 of 8: name=_tll docCount=827949
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=761.782
    has deletions [delFileName=_tll_2fs.del]
    test: open reader.........OK [39283 deleted docs]
    test: fields, norms.......OK [180 fields]
    test: terms, freq, prox...OK [10925397 terms; 43361019 terms/docs pairs; 42123294 tokens]
    test: stored fields.......OK [8673255 total field count; avg 10.997 fields per doc]
    test: term vectors........OK [880272 total vector count; avg 1.116 term/freq vector fields per doc]

  4 of 8: name=_tdx docCount=18372
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=56.856
    has deletions [delFileName=_tdx_9.del]
    test: open reader.........OK [18368 deleted docs]
    test: fields, norms.......OK [50 fields]
    test: terms, freq, prox...OK [261974 terms; 2018842 terms/docs pairs; 150 tokens]
    test: stored fields.......OK [76 total field count; avg 19 fields per doc]
    test: term vectors........OK [14 total vector count; avg 3.5 term/freq vector fields per doc]

  5 of 8: name=_te8 docCount=19929
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=60.475
    has deletions [delFileName=_te8_a.del]
    test: open reader.........OK [19900 deleted docs]
    test: fields, norms.......OK [72 fields]
    test: terms, freq, prox...OK [276045 terms; 2166958 terms/docs pairs; 1196 tokens]
    test: stored fields.......OK [522 total field count; avg 18 fields per doc]
    test: term vectors........OK [132 total vector count; avg 4.552 term/freq vector fields per doc]

  6 of 8: name=_tej docCount=22201
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=65.827
    has deletions [delFileName=_tej_o.del]
    test: open reader.........OK [22171 deleted docs]
    test: fields, norms.......OK [50 fields]
    test: terms, freq, prox...FAILED
    WARNING: would remove reference to this segment (-fix was not specified); full exception:
java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 34950
at org.apache.lucene.util.BitVector.get(BitVector.java:91)
at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:125)
at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
at org.apache.lucene.index.CheckIndex.check(CheckIndex.java:222)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:433)

  7 of 8: name=_1agw docCount=1717926
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=2,390.413
    has deletions [delFileName=_1agw_1.del]
    test: open reader.........OK [1 deleted docs]
    test: fields, norms.......OK [438 fields]
    test: terms, freq, prox...OK [20959015 terms; 101603282 terms/docs pairs; 123561985 tokens]
    test: stored fields.......OK [26248407 total field count; avg 15.279 fields per doc]
    test: term vectors........OK [4911368 total vector count; avg 2.859 term/freq vector fields per doc]

  8 of 8: name=_1agz docCount=1
    compound=false
    hasProx=true
    numFiles=8
    size (MB)=0
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [6 fields]
    test: terms, freq, prox...OK [6 terms; 6 terms/docs pairs; 6 tokens]
    test: stored fields.......OK [6 total field count; avg 6 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

WARNING: 1 broken segments detected
WARNING: 30 documents would be lost if -fix were specified

NOTE: would write new segments file [-fix was not specified]
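
The "30 documents would be lost" figure can be cross-checked against the per-segment numbers above: the broken segment _tej reports docCount=22201 and 22171 deleted docs, which leaves 30 live documents. A minimal sketch of that arithmetic follows; the class and method names are hypothetical and are not part of Lucene or CheckIndex.

```java
public class LiveDocs {
    // Live documents in a segment = total docs minus docs marked deleted.
    // Hypothetical helper for illustration only; not a Lucene API.
    public static int liveDocs(int docCount, int deletedDocs) {
        if (deletedDocs < 0 || deletedDocs > docCount) {
            throw new IllegalArgumentException(
                "deletedDocs must be between 0 and docCount");
        }
        return docCount - deletedDocs;
    }

    public static void main(String[] args) {
        // Values copied from the CheckIndex output for segment _tej.
        System.out.println("_tej live docs: " + liveDocs(22201, 22171));
    }
}
```

The same subtraction gives 4 live docs for _tdx (18372 - 18368), which agrees with its stored-fields line (76 fields at an average of 19 per doc).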



On Fri, Jan 2, 2009 at 3:47 PM, Brian Whitman <brian@echonest.com> wrote:

> I will but I bet I can guess what happened -- this index has many
> duplicates in it as well (same uniqueKey id multiple times) - this happened
> to us once before and it was because the solr server went down during an
> add. We may have to re-index, but I will run CheckIndex now. Thanks.
> (Thread for dupes here:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200803.mbox/%3c4ED8C459-1B0F-41CC-986C-4FFCEEF82E55@variogr.am%3e)
>
>
> On Fri, Jan 2, 2009 at 3:44 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
>
>> It looks like your index has some kind of corruption.  Were there any other
>> exceptions prior to this one, or any previous problems with the OS/IO
>> system?
>>
>> Can you run CheckIndex (java org.apache.lucene.index.CheckIndex to see
>> usage) and post the output?
>> Mike
>>
>> Brian Whitman <brian@echonest.com> wrote:
>>
>> > I am getting this on a 10GB index (via solr 1.3) during an optimize:
>> > Jan 2, 2009 6:51:52 PM org.apache.solr.common.SolrException log
>> > SEVERE: java.io.IOException: background merge hit exception: _ks4:C2504982 _oaw:C514635 _tll:C827949 _tdx:C18372 _te8:C19929 _tej:C22201 _1agw:C1717926 _1agz:C1 into _1ah2 [optimize]
>> > at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2346)
>> > at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2280)
>> > at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:355)
>> > at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:77)
>> > ...
>> >
>> > Exception in thread "Lucene Merge Thread #2" org.apache.lucene.index.MergePolicy$MergeException: java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 34950
>> > at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:314)
>> > at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
>> > Caused by: java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 34950
>> > at org.apache.lucene.util.BitVector.get(BitVector.java:91)
>> > at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:125)
>> > at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
>> > ...
>> >
>> >
>> > Does anyone know how this is caused and how I can fix it? It happens with
>> > every optimize. Commits were very slow on this index as well (40x as slow as
>> > a similar index on another machine). I have plenty of disk space (many 100s
>> > of GB) free.
>> >
>>
>
>
