lucene-dev mailing list archives

From Shai Erera <>
Subject Re: Some thoughts around the use of reader.isDeleted and hasDeletions
Date Thu, 18 Jun 2009 20:29:43 GMT
I've made the changes to SegmentMerger and want to make the following
changes to IndexReader.document(): (1) don't call ensureOpen() and (2) don't
check isDeleted.

The question is: can I make these changes to the current impls, or do I need to
deprecate document() and come up w/ a new name? A new name is not a big
challenge here: we could choose doc() or getDocument(), for example. I don't
feel rawDocument flows nicely (what's "raw" about it?).

IMO, even though these changes break back-compat (in runtime behavior), they
are not likely to affect anyone. I mean, why would someone deliberately call
document() when the reader has already been closed (unless they don't know
it is closed at the time of the call)? For easy migration (if you rely on
that feature), I can add isClose()/isOpen() w/ a default impl to call

Or why call document(doc) if the doc is deleted? What's the scenario?

Anyway, these two changes are necessary because our merging code calls these
methods, but it already checks whether a doc is deleted beforehand. So it's
just a question of a new method vs. a runtime change.
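
To make the trade-off concrete, here is a minimal Java sketch of the two behaviors. It is a toy in-memory model under stated assumptions, not Lucene code: `ToyReader`, its fields, and the exception types are hypothetical, and `doc()` is just one of the candidate names mentioned above. `document()` models the checked behavior (open check plus deletion check), while `doc()` skips both and leaves the checks to the caller, as a merge loop would.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: ToyReader is hypothetical and is NOT Lucene's IndexReader.
class UncheckedDocumentSketch {

    static final class ToyReader {
        private final String[] docs;
        private final BitSet deleted = new BitSet();
        private boolean closed;
        int deletionChecks; // counts isDeleted() calls, for illustration only

        ToyReader(String... docs) { this.docs = docs; }

        void delete(int n) { deleted.set(n); }
        void close() { closed = true; }
        int maxDoc() { return docs.length; }

        boolean isDeleted(int n) {
            deletionChecks++;
            return deleted.get(n);
        }

        // Checked variant: verifies the reader is open and the doc is live.
        String document(int n) {
            if (closed) throw new IllegalStateException("reader is closed");
            if (isDeleted(n)) throw new IllegalArgumentException("document " + n + " is deleted");
            return docs[n];
        }

        // Unchecked variant (hypothetical name): the caller is responsible for both checks.
        String doc(int n) {
            return docs[n];
        }
    }

    // Merge-style loop: the caller checks deletion once per doc, then reads unchecked.
    static List<String> mergeLiveDocs(ToyReader reader) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < reader.maxDoc(); i++) {
            if (reader.isDeleted(i)) continue;
            out.add(reader.doc(i));
        }
        return out;
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader("a", "b", "c");
        reader.delete(1);
        System.out.println(mergeLiveDocs(reader));  // prints [a, c]
        System.out.println(reader.deletionChecks);  // prints 3
    }
}
```

With the checked `document()` in the same loop, each live doc would be tested for deletion twice (once by the loop, once inside the call), which is the redundancy the proposal removes.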

What do you think?


On Wed, Jun 10, 2009 at 6:39 PM, Yonik Seeley <> wrote:

> On Wed, Jun 10, 2009 at 11:16 AM, Shai Erera <> wrote:
> >> it makes sense because isDeleted() is essentially the *only* thing
> >> being done in the loop, and hence we can eliminate the loop entirely
> >
> > You mean that in case there is a matching segment, we can call
> > matchingVectorsReader.rawDocs(rawDocLengths, rawDocLengths2, 0, maxDoc)?
> Right... or rather directly calculate numDocs and docNum instead of
> using the loop.
> > But in case it does not have a matching segment, we'd still need to
> > iterate on the docs, and copy the term vectors one by one, right?
> Right, and that's the case where I think duplicating the code to
> remove a single branch-predictable boolean flag isn't warranted as it
> won't result in a measurable performance increase.
> -Yonik
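
The point in the quoted exchange can be sketched as follows: when there are no deletions, the per-document loop does nothing except test a flag, so the whole range can be copied at once. This is a toy Java model under stated assumptions, not the SegmentMerger code: `BulkCopySketch` and `copyLive` are hypothetical names, and a plain `String[]` stands in for stored documents or term vectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Toy model only: a String[] stands in for a segment's stored data.
class BulkCopySketch {

    // Copies the live entries of src. A null or empty BitSet means "no deletions".
    static List<String> copyLive(String[] src, BitSet deleted) {
        if (deleted == null || deleted.isEmpty()) {
            // No deletions: every doc is live, so copy the whole range in one call
            // instead of testing each doc in a loop.
            return new ArrayList<>(Arrays.asList(src));
        }
        // Deletions present: fall back to the per-document loop.
        List<String> out = new ArrayList<>();
        for (int i = 0; i < src.length; i++) {
            if (!deleted.get(i)) {
                out.add(src[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] docs = {"a", "b", "c"};
        System.out.println(copyLive(docs, new BitSet()));  // prints [a, b, c]

        BitSet deleted = new BitSet();
        deleted.set(1);
        System.out.println(copyLive(docs, deleted));       // prints [a, c]
    }
}
```

The per-document branch is kept as a single shared loop here, in line with the argument that duplicating it to drop one well-predicted boolean test is unlikely to give a measurable gain.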
