lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robert Muir <rcm...@gmail.com>
Subject Re: revisit payloads API in DocsAndPositionsEnum
Date Sun, 12 Aug 2012 06:18:10 GMT
That would definitely be separate. I looked into this: the problem is
things like LUCENE-4219.

So the current behavior for payload-using span queries (at least
span-near with payloads) is wrong, they score differently depending
upon whether you next() or advance() them (which is horrible!), so I
completely don't trust any of the existing tests.

I think we need to either first fix the bugs in spans or do something
else with them before trying to simplify their APIs.

On Sun, Aug 12, 2012 at 2:10 AM, Shai Erera <serera@gmail.com> wrote:
> Looks good.
>
> Perhaps separately, what do you think about doing the same to
> Spans.isPayloadAvailable/getPayload?
>
> Shai
>
>
> On Sun, Aug 12, 2012 at 8:56 AM, Robert Muir <rcmuir@gmail.com> wrote:
>>
>> Here's a patch: http://pastebin.com/d2DdWxJp
>>
>> On Sat, Aug 11, 2012 at 1:35 PM, Simon Willnauer
>> <simon.willnauer@gmail.com> wrote:
>> > +1 this makes lots of sense
>> >
>> > simon
>> >
>> > On Sat, Aug 11, 2012 at 7:28 PM, Michael McCandless
>> > <lucene@mikemccandless.com> wrote:
>> >> +1
>> >>
>> >> Mike McCandless
>> >>
>> >> http://blog.mikemccandless.com
>> >>
>> >>
>> >> On Sat, Aug 11, 2012 at 10:08 AM, Robert Muir <rcmuir@gmail.com> wrote:
>> >>> The payloads api is really confusing:
>> >>>
>> >>>   /** Returns the payload at this position, or null if no
>> >>>    *  payload was indexed.  Only call this once per
>> >>>    *  position. You should not modify anything (neither
>> >>>    *  members of the returned BytesRef nor bytes in the
>> >>>    *  byte[]). */
>> >>>   public abstract BytesRef getPayload() throws IOException;
>> >>>
>> >>>   public abstract boolean hasPayload();
>> >>>
>> >>> 1. is it ok for the consumer to call getPayload() [checking for null],
>> >>> and never call hasPayload? It seems so, so why do we have hasPayload?
>> >>> can we remove it?
>> >>>     The current situation requires impls to handle 'hasPayload' logic
>> >>> twice: in hasPayload itself and also in getPayload.
>> >>>
>> >>> 2. You should not modify anything (neither members of the returned
>> >>> BytesRef nor bytes in the byte[]). Then why do we have this in
>> >>> TestPayloads.java:
>> >>>    // Just to ensure all codecs can
>> >>>    // handle a caller that mucks with the
>> >>>    // returned payload:
>> >>>    if (rarely()) {
>> >>>      br.bytes = new byte[random().nextInt(5)];
>> >>>    }
>> >>>    br.length = 0;
>> >>>    br.offset = 0;
>> >>>
>> >>>    Testing for this totally disagrees with the javadocs.
>> >>>
>> >>> 3. 'Only call this once per position'. This is totally different than
>> >>> any of our other 'attributes' on the enums (freq(), startOffset(),
>> >>> endOffset(), etc). I think we should
>> >>>     remove this.
>> >>>
>> >>> So I want to propose we remove hasPayload(), remove 'only call once
>> >>> per position', and remove this test in TestPayloads.java. If we want
>> >>> to make sure none
>> >>> of lucene's code itself is mucking with the returned BytesRef, we can
>> >>> add such a check in AssertingCodec.
>> >>>
>> >>>  /** Returns the payload at this position, or null if no
>> >>>    *  payload was indexed. You should not modify anything (neither
>> >>>    *  members of the returned BytesRef nor bytes in the
>> >>>    *  byte[]). */
>> >>> public abstract BytesRef getPayload() throws IOException;
>> >>>
>> >>> --
>> >>> lucidimagination.com
>> >>>
>> >>> ---------------------------------------------------------------------
>> >>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >>> For additional commands, e-mail: dev-help@lucene.apache.org
>> >>>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>> >>
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: dev-help@lucene.apache.org
>> >
>>
>>
>>
>> --
>> lucidworks.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>



-- 
lucidworks.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message