lucene-dev mailing list archives

From Earwin Burrfoot <ear...@gmail.com>
Subject Re: Some thoughts around the use of reader.isDeleted and hasDeletions
Date Fri, 19 Jun 2009 03:18:36 GMT
Make it a runtime change. It is hard to imagine anyone relying on a failing document() call.
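
For illustration only, here is a minimal Java sketch of the runtime change being discussed. ToyReader, checkedDocument() and copyLiveDocs() are hypothetical names, not Lucene API, and the standard exceptions merely stand in for whatever the real reader throws. The sketch assumes the caller (here a merge-style loop) has already tested isDeleted(), which is the situation described for the merging code.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class DocumentAccessSketch {

    /** Toy stand-in for a reader; hypothetical, not Lucene's IndexReader. */
    static final class ToyReader {
        private final String[] stored;  // one stored "document" per doc id
        private final BitSet deleted;   // a set bit marks a deleted doc
        private boolean closed = false;

        ToyReader(String[] stored, BitSet deleted) {
            this.stored = stored;
            this.deleted = deleted;
        }

        int maxDoc() { return stored.length; }
        boolean isDeleted(int n) { return deleted.get(n); }
        void close() { closed = true; }

        /** Behavior before the change: both checks run on every call. */
        String checkedDocument(int n) {
            if (closed) throw new IllegalStateException("reader is closed");
            if (isDeleted(n)) throw new IllegalArgumentException("doc " + n + " is deleted");
            return stored[n];
        }

        /** Behavior after the change: no checks, the caller is responsible. */
        String document(int n) { return stored[n]; }
    }

    /** Merge-style loop: it tests isDeleted() itself, so a second check is redundant. */
    static List<String> copyLiveDocs(ToyReader reader) {
        List<String> out = new ArrayList<>();
        for (int n = 0; n < reader.maxDoc(); n++) {
            if (reader.isDeleted(n)) continue;  // the only deletion check needed
            out.add(reader.document(n));
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1);
        ToyReader reader = new ToyReader(new String[] {"a", "b", "c"}, deleted);
        System.out.println(copyLiveDocs(reader)); // prints [a, c]
    }
}
```

Code that calls the unchecked variant on a deleted doc simply gets the stored data back instead of an exception, which is the only observable difference.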

On Fri, Jun 19, 2009 at 00:29, Shai Erera<serera@gmail.com> wrote:
> I've made the changes to SegmentMerger and want to make the following
> changes to IndexReader.document(): (1) don't call ensureOpen() and (2) don't
> check isDeleted.
>
> Question is - can I make these changes on the current impls, or do I need to
> deprecate and come up w/ a new name? Here a new name is not a big challenge
> - we can choose: doc() or getDocument() for example. I don't feel
> rawDocument flows nicely (what's "raw" about it?)
>
> IMO, even though these changes break back-compat (at runtime), they are not
> likely to affect anyone. I mean, why would someone deliberately call
> document() when the reader has already been closed (unless he doesn't know
> it at the time of calling document())? For easy migration (if you rely on
> that feature), I can add isClose()/isOpen() w/ a default impl that calls
> ensureOpen().
>
> Or why call document(doc) if the doc is deleted? What's the scenario?
>
> Anyway, those two changes are necessary because our merging code calls
> document(), but already checks beforehand whether a doc is deleted. So it's
> just a question of a new method vs. a runtime change.
>
> What do you think?
>
> Shai
>
> On Wed, Jun 10, 2009 at 6:39 PM, Yonik Seeley <yonik@lucidimagination.com>
> wrote:
>>
>> On Wed, Jun 10, 2009 at 11:16 AM, Shai Erera <serera@gmail.com> wrote:
>> >> it makes sense because isDeleted() is essentially the *only* thing
>> >> being done in the loop, and hence we can eliminate the loop entirely
>> >
>> > You mean that in case there is a matching segment, we can call
>> > matchingVectorsReader.rawDocs(rawDocLengths, rawDocLengths2, 0, maxDoc)?
>>
>> Right... or rather directly calculate numDocs and docNum instead of
>> using the loop.
>>
>> > But in case it does not have a matching segment, we'd still need to
>> > iterate
>> > on the docs, and copy the term vectors one by one, right?
>>
>> Right, and that's the case where I think duplicating the code to
>> remove a single branch-predictable boolean flag isn't warranted as it
>> won't result in a measurable performance increase.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
>
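
The loop elimination discussed in the quoted exchange can be sketched roughly as follows. This is a hedged illustration, not the SegmentMerger code: liveRuns() and its BitSet argument are hypothetical, and a null or empty bit set is assumed to mean "no deletions". When nothing is deleted, the single run (0, maxDoc) is computed directly; otherwise the per-doc isDeleted-style loop remains.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class LiveRunsSketch {

    /**
     * Returns {start, length} pairs of consecutive live (non-deleted) docs.
     * Hypothetical helper; a null or empty bit set means "no deletions".
     */
    static List<int[]> liveRuns(BitSet deleted, int maxDoc) {
        List<int[]> runs = new ArrayList<>();
        boolean hasDeletions = deleted != null && !deleted.isEmpty();

        if (!hasDeletions) {
            // Fast path: the whole segment is one run, no per-doc loop needed.
            if (maxDoc > 0) runs.add(new int[] {0, maxDoc});
            return runs;
        }

        // Slow path: walk the docs and group live ones into runs.
        int doc = 0;
        while (doc < maxDoc) {
            if (deleted.get(doc)) {  // skip deleted docs
                doc++;
                continue;
            }
            int start = doc;
            while (doc < maxDoc && !deleted.get(doc)) doc++;
            runs.add(new int[] {start, doc - start});
        }
        return runs;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(2);
        for (int[] run : liveRuns(deleted, 5)) {
            System.out.println("start=" + run[0] + " length=" + run[1]);
        }
    }
}
```

Each run could then be handed to a bulk-copy call in one go, which is where the saving over copying doc by doc would come from.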



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


