lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Read DocValue twice
Date Tue, 20 Feb 2018 13:46:45 GMT
I can't see any attachments; did you forget to attach the files?

On Tue, Feb 20, 2018 at 11:30, Vadim Gindin <vgindin@detectum.com> wrote:

> Adrien, would you be so kind as to look at my Query/Weight/Scorer? I'm
> attaching the files that contain them. I've removed unnecessary code such as
> initialization, toString, equals and so on. Finally, how should the iterator
> work correctly? Should it be possible to navigate through it twice, once in
> score() and once in explain()?
>
> Regards,
> Vadim Gindin
>
> On Mon, Feb 19, 2018 at 10:55 PM, Adrien Grand <jpountz@gmail.com> wrote:
>
>> Yes, this is the problem. This doc ID is a special sentinel value that
>> means that the iterator is exhausted. I don't have enough context to know
>> what the exact problem is but there is a bug in your custom query.
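
The sentinel mentioned above is Integer.MAX_VALUE (2147483647), the value of DocIdSetIterator.NO_MORE_DOCS. The following is a minimal sketch of one way such a value could show up, assuming the same forward-only iterator is advanced a second time after a first pass has already consumed it. ToyDocIdIterator is a hypothetical stand-in written for illustration; it is not one of Lucene's classes and it may not reflect the actual bug in the custom query.

```java
// Toy model of a forward-only doc ID iterator (hypothetical, not Lucene code).
final class ToyDocIdIterator {
    // Same numeric value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] sortedDocs; // matching doc IDs in increasing order
    private int index = -1;         // position in sortedDocs
    private int doc = -1;           // current doc ID, -1 before the first call

    ToyDocIdIterator(int... sortedDocs) {
        this.sortedDocs = sortedDocs;
    }

    int docID() {
        return doc;
    }

    // Moves forward to the first remaining doc >= target. It never moves
    // backwards, so once the end is reached every call returns NO_MORE_DOCS.
    int advance(int target) {
        do {
            index++;
        } while (index < sortedDocs.length && sortedDocs[index] < target);
        doc = index < sortedDocs.length ? sortedDocs[index] : NO_MORE_DOCS;
        return doc;
    }

    public static void main(String[] args) {
        ToyDocIdIterator it = new ToyDocIdIterator(3, 10, 42);
        // First pass (think: scoring) walks the iterator to the end.
        while (it.advance(it.docID() + 1) != NO_MORE_DOCS) {
            System.out.println("scored doc " + it.docID());
        }
        // Second pass (think: explain) on the SAME instance cannot rewind.
        System.out.println("explain: doc=" + it.advance(10));
    }
}
```

In this toy model a second pass only works on a freshly created iterator, which is why sharing one instance between two passes leads to the sentinel.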
>>
>> On Mon, Feb 19, 2018 at 16:07, Vadim Gindin <vgindin@detectum.com> wrote:
>>
>> > I have a scorer similar to DisjunctionScorer.java, with
>> >
>> > private final DisiPriorityQueue subScorers;
>> > private final DisjunctionDISIApproximation approximation;
>> >
>> > They are initialized in the constructor like this:
>> >
>> >    this.subScorers = new DisiPriorityQueue(subScorers.size());
>> >    for (Scorer scorer : subScorers) {
>> >        final DisiWrapper w = new DisiWrapper(scorer);
>> >        this.subScorers.add(w);
>> >    }
>> >    this.approximation = new DisjunctionDISIApproximation(this.subScorers);
>> >
>> >
>> >
>> > I use them in score() and in explain(). In explain() I do
>> >
>> >    this.approximation.advance(doc);
>> >
>> > followed by the same code as in score(). I've also added logging, and
>> > here is one line of the output:
>> >
>> > explain: doc=2147483647, field=params, maxDoc=67649
>> >
>> > The doc value doesn't look right.
>> >
>> >
>> > On Mon, Feb 19, 2018 at 7:32 PM, Adrien Grand <jpountz@gmail.com> wrote:
>> >
>> > > Can you add some debug logging to see what the values of topList.doc and
>> > > reader.maxDoc() are before you call advanceExact?
>> > >
>> > > What do you mean by "I reuse the same DisiPriorityQueue of scorers in
>> > > score() and explain()"? This shouldn't be possible.
>> > >
>> > > On Mon, Feb 19, 2018 at 15:23, Vadim Gindin <vgindin@detectum.com> wrote:
>> > >
>> > > > I use these calls in both cases. In score() and explain() I have the
>> > > > following code:
>> > > >
>> > > > SortedNumericDocValues numDocVal = DocValues.getSortedNumeric(reader, fieldName);
>> > > > if (numDocVal != null && numDocVal.advanceExact(topList.doc)) {
>> > > >     long val = numDocVal.nextValue();
>> > > >
>> > > >     ..
>> > > > }
>> > > >
>> > > > I reuse the same DisiPriorityQueue of scorers in score() and explain().
>> > > >
>> > > > On Mon, Feb 19, 2018 at 6:54 PM, Adrien Grand <jpountz@gmail.com> wrote:
>> > > >
>> > > > > If you want to read the values again, you need to call setDocument
>> > > > > (Lucene < 7.0) or advanceExact (Lucene >= 7.0) before calling nextValue().
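
The "position first, then read" contract described above can be sketched with a toy cursor. ToySortedNumericValues and its helper firstValueOr are hypothetical names invented for this illustration; the method names advanceExact, docValueCount and nextValue mirror Lucene's SortedNumericDocValues, but the implementation here is an assumption and not Lucene's. The toy throws a descriptive exception on an over-read, whereas the real classes leave that case undefined.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a multi-valued doc values cursor (hypothetical, not Lucene code).
final class ToySortedNumericValues {
    private final Map<Integer, long[]> valuesByDoc;
    private long[] current = new long[0]; // values of the current document
    private int upto = 0;                 // how many values have been read

    ToySortedNumericValues(Map<Integer, long[]> valuesByDoc) {
        this.valuesByDoc = valuesByDoc;
    }

    // Positions the cursor on doc and resets the per-document read pointer.
    // Returns false when the document has no value.
    boolean advanceExact(int doc) {
        long[] values = valuesByDoc.get(doc);
        current = values == null ? new long[0] : values;
        upto = 0;
        return values != null;
    }

    int docValueCount() {
        return current.length;
    }

    // May be called at most docValueCount() times after each advanceExact().
    long nextValue() {
        if (upto >= current.length) {
            throw new IllegalStateException("nextValue() called more than docValueCount() times");
        }
        return current[upto++];
    }

    // Hypothetical helper: always re-position before reading.
    static long firstValueOr(ToySortedNumericValues dv, int doc, long fallback) {
        return dv.advanceExact(doc) && dv.docValueCount() > 0 ? dv.nextValue() : fallback;
    }

    public static void main(String[] args) {
        Map<Integer, long[]> data = new HashMap<>();
        data.put(7, new long[] {5L});
        ToySortedNumericValues dv = new ToySortedNumericValues(data);
        System.out.println(firstValueOr(dv, 7, -1L)); // first read
        System.out.println(firstValueOr(dv, 7, -1L)); // second read after re-positioning
    }
}
```

One difference from the toy: in Lucene 7 the target passed to advanceExact must be a valid doc ID that is greater than or equal to the current one, so a later pass over earlier documents needs a fresh instance from DocValues.getSortedNumeric.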
>> > > > >
>> > > > > On Mon, Feb 19, 2018 at 14:41, Vadim Gindin <vgindin@detectum.com> wrote:
>> > > > >
>> > > > > > Hi all
>> > > > > >
>> > > > > > I use DocValues in a scoring function, i.e. I have a column of
>> > > > > > integers that is used in the scoring formula. So I have a scorer
>> > > > > > that calculates the scoring function twice:
>> > > > > > - in score()
>> > > > > > - in explain()
>> > > > > >
>> > > > > > I got the following error in explain:
>> > > > > >
>> > > > > > Caused by: java.lang.IndexOutOfBoundsException
>> > > > > >         at java.nio.Buffer.checkIndex(Buffer.java:540) ~[?:1.8.0_161]
>> > > > > >         at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:253) ~[?:1.8.0_161]
>> > > > > >         at org.apache.lucene.store.ByteBufferGuard.getByte(ByteBufferGuard.java:118) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
>> > > > > >         at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readByte(ByteBufferIndexInput.java:385) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
>> > > > > >         at org.apache.lucene.util.packed.DirectReader$DirectPackedReader8.get(DirectReader.java:145) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
>> > > > > >         at org.apache.lucene.codecs.lucene70.Lucene70DocValuesProducer$3.longValue(Lucene70DocValuesProducer.java:481) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
>> > > > > >         at org.apache.lucene.index.SingletonSortedNumericDocValues.nextValue(SingletonSortedNumericDocValues.java:73) ~[lucene-core-7.1.0.jar:7.1.0 84c90ad2c0218156c840e19a64d72b8a38550659 - ubuntu - 2017-10-13 16:12:42]
>> > > > > >
>> > > > > > I've found the following comment in the source code of
>> > > > > > SortedNumericDocValues.java:
>> > > > > >
>> > > > > > /**
>> > > > > >  * Iterates to the next value in the current document. Do not call
>> > > > > >  * this more than {@link #docValueCount} times for the document.
>> > > > > >  */
>> > > > > >
>> > > > > > public abstract long nextValue() throws IOException;
>> > > > > >
>> > > > > >
>> > > > > > Questions:
>> > > > > > 1) Why can't I read the values twice?
>> > > > > > 2) How can I manage this situation?
>> > > > > > 3) Can it work for NumericDocValues?
>> > > > > >
>> > > > > > Regards,
>> > > > > > Vadim Gindin
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>