lucene-java-user mailing list archives

From Tony Ma <...@opentext.com>
Subject Re: [EXTERNAL] - Lucene 4.5.1 payload corruption - ArrayIndexOutOfBoundsException
Date Fri, 02 Feb 2018 14:19:33 GMT
Thanks Robert.

We are not going to use a merge to repair the corrupted index. Our concern is that one
segment is already corrupted, while merges normally run automatically in the background.
I am trying to find out what happens in that scenario: will the merge stop with an
exception, or will it complete and produce a new corrupted segment?

To be specific, we have a corrupted segment with the following CheckIndex output:
  1 of 5: name=_0 docCount=8341939
    codec=Lucene45
    compound=false
    numFiles=48
    size (MB)=16,446.275
    diagnostics = {os=Windows Server 2008 R2, java.vendor=Oracle Corporation, java.version=1.7.0_80, lucene.version=4.5.1 1533280 - mark - 2013-10-17 21:37:01, mergeMaxNumSegments=5, os.arch=amd64, source=merge, mergeFactor=6, timestamp=1514627603337, os.version=6.1}
    has deletions [delGen=130]
    test: open reader.........OK [4022 deleted docs]
    test: fields..............OK [268 fields]
    test: field norms.........OK [3 fields]
    test: terms, freq, prox...ERROR: java.lang.ArrayIndexOutOfBoundsException: 105
java.lang.ArrayIndexOutOfBoundsException: 105
	at org.apache.lucene.codecs.lucene41.ForUtil.readBlock(ForUtil.java:196)
	at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.refillPositions(Lucene41PostingsReader.java:1284)
	at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.skipPositions(Lucene41PostingsReader.java:1505)
	at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.nextPosition(Lucene41PostingsReader.java:1548)
	at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:979)
	at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1232)
	at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:623)
	at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:372)


CheckIndex first checks each position (all pass) and then runs a skip test (which fails),
so the corruption appears to be in the skip list. I am wondering whether, in this particular
case, a merge could rebuild a new skip list, since every position reads fine.
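
To illustrate why a bad block header would surface as the ArrayIndexOutOfBoundsException
in the trace above, here is a minimal stand-alone Java model. It is not Lucene's ForUtil:
the class name BlockHeaderModel, the BLOCK_SIZE of 128, and the contents of the size table
are assumptions made for this sketch. The only part taken from this thread is that the first
byte of a block is expected to be <= 32 and is used as an array index, as in the readBlock
snippet quoted below.

```java
public class BlockHeaderModel {
    // Valid bits-per-value headers are 0..32, so the lookup table has 33 slots.
    static final int MAX_BITS = 32;
    // Assumed block size for this sketch.
    static final int BLOCK_SIZE = 128;
    static final int[] ENCODED_SIZES = new int[MAX_BITS + 1];

    static {
        for (int bits = 1; bits <= MAX_BITS; bits++) {
            // Assumed layout: BLOCK_SIZE values packed at `bits` bits each.
            ENCODED_SIZES[bits] = BLOCK_SIZE * bits / 8;
        }
    }

    // Mirrors the unguarded lookup: with assertions disabled, a garbage
    // header byte is used directly as an index into the 33-slot table.
    static int encodedSizeUnchecked(byte[] block) {
        int numBits = block[0];
        return ENCODED_SIZES[numBits];
    }

    // Hypothetical guarded variant that reports the bad header explicitly.
    static int encodedSizeChecked(byte[] block) {
        int numBits = block[0];
        if (numBits < 0 || numBits > MAX_BITS) {
            throw new IllegalStateException("corrupt block header: numBits=" + numBits);
        }
        return ENCODED_SIZES[numBits];
    }

    public static void main(String[] args) {
        byte[] healthy = {8};
        byte[] corrupt = {105}; // the index reported in the CheckIndex trace

        System.out.println("healthy block bytes: " + encodedSizeUnchecked(healthy));
        try {
            encodedSizeUnchecked(corrupt);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e);
        }
        try {
            encodedSizeChecked(corrupt);
        } catch (IllegalStateException e) {
            System.out.println("checked lookup failed: " + e.getMessage());
        }
    }
}
```

With a header byte of 105, the unchecked lookup fails in the same way as the CheckIndex
trace, while the guarded variant names the bad value. This says nothing about how the byte
became wrong; it only shows how a wrong byte at that position turns into this exception.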

That way I could at least tell whether this segment was newly corrupted, or whether an
already corrupted segment was merged into a new one.
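
One hedged way to approach that question from the CheckIndex output alone is to compare the
diagnostics of the two reports. The sketch below is plain Java with no Lucene dependency.
The class SegmentDiagnostics and its methods are hypothetical helpers, the parser is naive,
and it assumes that the "timestamp" entry is milliseconds since the Unix epoch. The input
strings are abbreviated copies of the diagnostics from the February and January reports in
this thread.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

public class SegmentDiagnostics {

    // Naive parser for the "{key=value, key=value}" text printed by CheckIndex.
    // Assumes no value contains ", " (true for the output in this thread).
    static Map<String, String> parse(String diagnostics) {
        String body = diagnostics.trim();
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (String pair : body.split(", ")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                fields.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return fields;
    }

    // Assumption: "timestamp" is milliseconds since the Unix epoch.
    static String writtenAtUtc(Map<String, String> fields) {
        long millis = Long.parseLong(fields.get("timestamp"));
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss 'UTC'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(millis));
    }

    // Treats two reports as describing the same segment write when both
    // "source" and "timestamp" agree.
    static boolean sameSegmentWrite(Map<String, String> a, Map<String, String> b) {
        String ts = a.get("timestamp");
        String src = a.get("source");
        return ts != null && src != null
                && ts.equals(b.get("timestamp"))
                && src.equals(b.get("source"));
    }

    public static void main(String[] args) {
        Map<String, String> february = parse(
                "{mergeMaxNumSegments=5, os.arch=amd64, source=merge, mergeFactor=6, "
                + "timestamp=1514627603337, os.version=6.1}");
        Map<String, String> january = parse(
                "{timestamp=1514627603337, mergeFactor=6, os.version=6.1, source=merge}");

        System.out.println("segment written at: " + writtenAtUtc(february));
        System.out.println("same segment write: " + sameSegmentWrite(january, february));
    }
}
```

If the assumption about the timestamp holds, both reports describe a segment _0 produced by
the same merge, which would suggest it has not been re-merged between the two reports. It
cannot tell whether the inputs to that merge were already corrupted.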


On 2/2/18, 9:58 PM, "Robert Muir" <rcmuir@gmail.com> wrote:

    IMO this is not something you want to do.
    
    The only remedy CheckIndex has for a corrupted segment is to drop it
    completely: and if you choose to do that then you lose all the
    documents in that segment. So it's not very useful to merge it with
    other segments into bigger corrupted segments since it will just
    spread more corruption.
    
    On Fri, Feb 2, 2018 at 3:08 AM, Tony Ma <tma@opentext.com> wrote:
    > Hi experts,
    >
    > A question about corrupted indexes: if an index segment is already corrupted, can it
    > still be merged with another segment? Or does it depend on where the corruption is,
    > for example in the .pay file?
    >
    > From: 马江 <tma@opentext.com>
    > Date: Friday, January 19, 2018 at 9:52 AM
    > To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>
    > Subject: Re: [EXTERNAL] - Lucene 4.5.1 payload corruption - ArrayIndexOutOfBoundsException
    >
    > Hi experts,
    >
    > Still on this issue: is there any known bug that can cause payload file corruption?
    > The stack trace indicates that the first byte of the input should be an integer <= 32,
    > but it is actually 110.
    > Our customers have seen this kind of corruption several times, and every time the
    > corruption is in the payload. Is it possible that the bytes we put into the payload
    > are incompatible with the payload codec?
    >
    >
    >   void readBlock(IndexInput in, byte[] encoded, int[] decoded) throws IOException {
    >     final int numBits = in.readByte();
    >     assert numBits <= 32 : numBits;
    >
    >     if (numBits == ALL_VALUES_EQUAL) {
    >       final int value = in.readVInt();
    >       Arrays.fill(decoded, 0, BLOCK_SIZE, value);
    >       return;
    >     }
    >
    >     final int encodedSize = encodedSizes[numBits];
    >     in.readBytes(encoded, 0, encodedSize);
    >
    >
    > From: 马江 <tma@opentext.com>
    > Reply-To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>
    > Date: Tuesday, January 16, 2018 at 11:16 AM
    > To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>
    > Subject: [EXTERNAL] - Lucene 4.5.1 payload corruption - ArrayIndexOutOfBoundsException
    >
    > Hi experts,
    >
    > Recently one of our customers has been continuously seeing an
    > ArrayIndexOutOfBoundsException thrown from Lucene.
    >
    > Our product is a full-text search engine built on top of Lucene; the stack trace
    > follows. The customer says they can reproduce the issue even after re-indexing
    > everything from scratch.
    >
    > Caused by: java.lang.ArrayIndexOutOfBoundsException: 110
    >                 at org.apache.lucene.codecs.lucene41.ForUtil.readBlock(ForUtil.java:196)
    >                 at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.refillPositions(Lucene41PostingsReader.java:1284)
    >                 at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.skipPositions(Lucene41PostingsReader.java:1505)
    >                 at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.nextPosition(Lucene41PostingsReader.java:1548)
    >                 at org.apache.lucene.search.spans.TermSpans.skipTo(TermSpans.java:82)
    >                 at org.apache.lucene.search.spans.SpanScorer.advance(SpanScorer.java:63)
    >                 at org.apache.lucene.search.ConjunctionScorer.doNext(ConjunctionScorer.java:69)
    >                 at org.apache.lucene.search.ConjunctionScorer.nextDoc(ConjunctionScorer.java:100)
    >                 at org.apache.lucene.search.Scorer.score(Scorer.java:64)
    >                 at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:627)
    >                 at com.xhive.lucene.executor.f.a(xdb:158)
    >                 at com.xhive.lucene.executor.f.search(xdb:145)
    >                 at com.xhive.lucene.subpath.e.a(xdb:313)
    >                 at com.xhive.lucene.subpath.e.a(xdb:264)
    >                 at com.xhive.lucene.subpath.e.a(xdb:183)
    >                 at com.xhive.lucene.executor.v.executeExternally(xdb:253)
    >                 at com.xhive.kernel.ay.externalIndexExecute(xdb:2791)
    >                 at com.xhive.core.index.ExternalIndex.executeExternally(xdb:485)
    >                 at com.xhive.core.index.XhiveMultiPathIndex.a(xdb:306)
    >                 at com.xhive.xquery.pathexpr.v$a.ci(xdb:124)
    >                 at com.xhive.xquery.pathexpr.ad$a.cp(xdb:104)
    >                 at com.xhive.xquery.pathexpr.ax.awP(xdb:39)
    >                 at com.xhive.xquery.pathexpr.ax.<init>(xdb:32)
    >                 at com.xhive.xquery.pathexpr.av.a(xdb:424)
    >                 at com.xhive.xquery.pathexpr.al$a.awk(xdb:61)
    >                 at com.xhive.xquery.pathexpr.ag.awj(xdb:28)
    >                 at com.xhive.xquery.pathexpr.al.Xo(xdb:26)
    >                 at com.xhive.xquery.pathexpr.aj.<init>(xdb:33)
    >                 at com.xhive.xquery.pathexpr.al.<init>(xdb:20)
    >                 at com.xhive.xquery.pathexpr.av.a(xdb:462)
    >                 at com.xhive.xquery.pathexpr.av.a(xdb:413)
    >                 at com.xhive.xquery.pathexpr.av.a(xdb:276)
    >                 at com.xhive.xquery.pathexpr.av.a(xdb:220)
    >
    >
    > ==============================================================
    > The following is the CheckIndex output for the corrupted segment. The full output is attached.
    >
    >
    > Checking consistency of: [CHECK_INDEXES_CONSISTENCY]
    > Library child /dpwprd/dsearch/Data/Collection2 is not in consistent state, errors report:
    > ============================================================
    > Library child name=/dpwprd/dsearch/Data/Collection2 indexes consistency report.
    > ============================================================
    > check external index consistency [database name: xhivedb; index name: dmftdoc; segment id: EI-0ab89c0c-2a9d-4fe2-97b9-5f0c96678f13-510173395289107-master; xhive index id id: 510173395289107]
    > check lucene indices
    > fail: lucene index LI-0001cd61-342c-4cfe-9898-c293eb1c8c09 is not consistent; Segments file=segments_2 numSegments=5 version=4.5.1 format=
    >   1 of 5: name=_0 docCount=8341939
    >     codec=Lucene45
    >     compound=false
    >     numFiles=26
    >     size (MB)=16,446.152
    >     diagnostics = {timestamp=1514627603337, mergeFactor=6, os.version=6.1, os=Windows Server 2008 R2, lucene.version=4.5.1 1533280 - mark - 2013-10-17 21:37:01, source=merge, os.arch=amd64, mergeMaxNumSegments=5, java.version=1.7.0_80, java.vendor=Oracle Corporation}
    >     has deletions [delGen=70]
    >     test: open reader.........OK [2295 deleted docs]
    >     test: fields..............OK [268 fields]
    >     test: field norms.........OK [3 fields]
    >     test: terms, freq, prox...ERROR: java.lang.ArrayIndexOutOfBoundsException
    > java.lang.ArrayIndexOutOfBoundsException
    >     test: stored fields.......OK [16679288 total field count; avg 2 fields per doc]
    >     test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    >     test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]
    > FAILED
    >     WARNING: fixIndex() would remove reference to this segment; full exception:
    > java.lang.RuntimeException: Term Index test failed
    >                 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:638)
    >                 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:372)
    >                 at com.xhive.lucene.executor.j.a(xdb:1190)
    >                 at com.xhive.lucene.executor.j.aY(xdb:1166)
    >                 at com.xhive.lucene.executor.v.checkIndexConsistency(xdb:370)
    >                 at com.xhive.kernel.ay.externalIndexCheckConsistency(xdb:2523)
    >                 at com.xhive.kernel.bn.handleRequest(xdb:2544)
    >                 at com.xhive.kernel.bn.run(xdb:222)
    >                 at java.lang.Thread.run(Thread.java:745)
    >
    > ==============================================================
    >
    > The corrupted payload stores a serialized HashMap containing several configurable
    > metadata values that are used to sort by condition.
    > The field with the corrupted payload is a single-term field, so its postings look
    > like a sequence of payloads.
    > We also store a freshness boost value in the payload of another field, which has no issues.
    >
    > This is the first customer to report such corruption in the many years since we
    > adopted Lucene 4.5.1 and released our product.
    >
    > Please let me know if you have any ideas about this issue.
    >
    > Thanks,
    > Tony Ma(马江)
    >
    
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
    
    

