lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robert Muir <rcm...@gmail.com>
Subject Re: revisit payloads API in DocsAndPositionsEnum
Date Sun, 12 Aug 2012 05:56:05 GMT
Here's a patch: http://pastebin.com/d2DdWxJp

On Sat, Aug 11, 2012 at 1:35 PM, Simon Willnauer
<simon.willnauer@gmail.com> wrote:
> +1 this makes lots of sense
>
> simon
>
> On Sat, Aug 11, 2012 at 7:28 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>> +1
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Sat, Aug 11, 2012 at 10:08 AM, Robert Muir <rcmuir@gmail.com> wrote:
>>> The payloads api is really confusing:
>>>
>>>   /** Returns the payload at this position, or null if no
>>>    *  payload was indexed.  Only call this once per
>>>    *  position. You should not modify anything (neither
>>>    *  members of the returned BytesRef nor bytes in the
>>>    *  byte[]). */
>>>   public abstract BytesRef getPayload() throws IOException;
>>>
>>>   public abstract boolean hasPayload();
>>>
>>> 1. is it ok for the consumer to call getPayload() [checking for null],
>>> and never call hasPayload? It seems so, so why do we have hasPayload?
>>> can we remove it?
>>>     The current situation requires impls to handle 'hasPayload' logic
>>> twice: in hasPayload itself and also in getPayload.
>>>
>>> 2. You should not modify anything (neither members of the returned
>>> BytesRef nor bytes in the byte[]). Then why do we have this in
>>> TestPayloads.java:
>>>    // Just to ensure all codecs can
>>>    // handle a caller that mucks with the
>>>    // returned payload:
>>>    if (rarely()) {
>>>      br.bytes = new byte[random().nextInt(5)];
>>>    }
>>>    br.length = 0;
>>>    br.offset = 0;
>>>
>>>    Testing for this totally disagrees with the javadocs.
>>>
>>> 3. 'Only call this once per position'. This is totally different than
>>> any of our other 'attributes' on the enums (freq(), startOffset(),
>>> endOffset(), etc). I think we should
>>>     remove this.
>>>
>>> So I want to propose we remove hasPayload(), remove 'only call once
>>> per position', and remove this test in TestPayloads.java. If we want
>>> to make sure none
>>> of lucene's code itself is mucking with the returned BytesRef, we can
>>> add such a check in AssertingCodec.
>>>
>>>  /** Returns the payload at this position, or null if no
>>>    *  payload was indexed. You should not modify anything (neither
>>>    *  members of the returned BytesRef nor bytes in the
>>>    *  byte[]). */
>>> public abstract BytesRef getPayload() throws IOException;
>>>
>>> --
>>> lucidimagination.com
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>



-- 
lucidworks.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message