lucene-java-user mailing list archives

From Robert Muir <>
Subject Re: BlockTreeTermsReader consumes crazy amount of memory
Date Thu, 11 Sep 2014 01:26:28 GMT
That's because there are 3 constructors in SegmentReader:

1. one used for opening a new reader (checks hasDeletions, only reads liveDocs if so)
2. one used for non-NRT reopen <-- the problematic one for you
3. one used for NRT reopen (takes a LiveDocs as a param, so no bug)

So personally I think you should be able to do this; we just have to
add the hasDeletions check to #2.
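
A rough way to picture the three paths and the proposed guard is the toy model below. It is only a sketch: SegmentReaderSketch, SegmentInfo, readLiveDocs and the method names are hypothetical stand-ins, not Lucene's actual classes or signatures, and it assumes that reading liveDocs for a segment without deletions fails.

```java
import java.util.BitSet;

/** Toy model only; none of these names mirror Lucene's real SegmentReader code. */
public class SegmentReaderSketch {

    /** Hypothetical stand-in for one segment's metadata at a given commit. */
    static final class SegmentInfo {
        final String name;
        final int maxDoc;
        final boolean hasDeletions;

        SegmentInfo(String name, int maxDoc, boolean hasDeletions) {
            this.name = name;
            this.maxDoc = maxDoc;
            this.hasDeletions = hasDeletions;
        }
    }

    /** Pretends to read live docs from disk; fails if the commit has no deletions (assumption). */
    static BitSet readLiveDocs(SegmentInfo si) {
        if (!si.hasDeletions) {
            throw new IllegalStateException("no live docs recorded for segment " + si.name);
        }
        BitSet live = new BitSet(si.maxDoc);
        live.set(0, si.maxDoc - 1); // pretend only the last document was deleted
        return live;
    }

    final SegmentInfo info;
    final Object core;     // state shared across reopens of the same segment
    final BitSet liveDocs; // null means every document is live

    private SegmentReaderSketch(SegmentInfo info, Object core, BitSet liveDocs) {
        this.info = info;
        this.core = core;
        this.liveDocs = liveDocs;
    }

    /** Path 1: fresh open. Reads live docs only when the segment has deletions. */
    static SegmentReaderSketch open(SegmentInfo si) {
        return new SegmentReaderSketch(si, new Object(), si.hasDeletions ? readLiveDocs(si) : null);
    }

    /** Path 2: non-NRT reopen, shown here with the hasDeletions check added. */
    static SegmentReaderSketch reopen(SegmentReaderSketch old, SegmentInfo si) {
        return new SegmentReaderSketch(si, old.core, si.hasDeletions ? readLiveDocs(si) : null);
    }

    /** Path 3: NRT reopen. The caller passes live docs in, so nothing is read here. */
    static SegmentReaderSketch reopenNrt(SegmentReaderSketch old, SegmentInfo si, BitSet liveDocs) {
        return new SegmentReaderSketch(si, old.core, liveDocs);
    }

    int numDocs() {
        return liveDocs == null ? info.maxDoc : liveDocs.cardinality();
    }

    public static void main(String[] args) {
        SegmentInfo atT1 = new SegmentInfo("_0", 10, true);  // later commit, has a deletion
        SegmentInfo atT0 = new SegmentInfo("_0", 10, false); // earlier commit, no deletions
        SegmentReaderSketch newer = open(atT1);
        SegmentReaderSketch older = reopen(newer, atT0);     // going back in time
        System.out.println(newer.numDocs() + " live docs at T1, " + older.numDocs() + " at T0");
    }
}
```

In this model, dropping the check in reopen() would make the backwards reopen call readLiveDocs() for a commit that has no deletions and fail, which is the kind of failure the guard is meant to avoid.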

On Wed, Sep 10, 2014 at 7:46 PM, Vitaly Funstein <> wrote:
> One other observation - if instead of a reader opened at a later commit
> point (T1), I pass in an NRT reader *without* doing the second commit on
> the index prior, then there is no exception. This probably also hinges on
> the assumption that no buffered docs have been flushed after T0, thus
> creating new segment files, as well... unfortunately, our system can't make
> either assumption.
> On Wed, Sep 10, 2014 at 4:30 PM, Vitaly Funstein <>
> wrote:
>>> Normally, reopens only go forwards in time, so if you could ensure
>>> that when you reopen one reader to another, the 2nd one is always
>>> "newer", then I think you should never hit this issue
>> Mike, I'm not sure if I fully understand your suggestion. In a nutshell,
>> the use case here is as follows: I want to be able to search the index at a
>> particular point in time, let's call it T0. To that end, I save the state
>> at that time via a commit and take a snapshot of the index. After that, the
>> index is free to move on, to another point in time, say T1 - and likely
>> does. The optimization we have been discussing (and this is what the test
>> code I posted does) basically asks the reader to go back to point T0, while
>> reusing as much of the index state from T1 as possible, as long as it is
>> unchanged between the two.
>> That's what DirectoryReader.openIfChanged(DirectoryReader, IndexCommit) is
>> supposed to do internally... or am I misinterpreting the
>> intent/implementation of it?
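
The point-in-time reopen described in the quoted message can be modeled roughly as follows. This is a plain-Java sketch, not Lucene code: Commit, SegmentState, Reader and reopen are hypothetical names, and it assumes a segment with the same name is unchanged between the two commits (it ignores deletions and merges).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of reopening a reader at another commit while sharing per-segment state. */
public class CommitReopenSketch {

    /** Hypothetical commit point: a label plus the segment names visible at that time. */
    static final class Commit {
        final String label;
        final List<String> segments;

        Commit(String label, String... segments) {
            this.label = label;
            this.segments = Arrays.asList(segments);
        }
    }

    /** Stand-in for the expensive per-segment state worth sharing between readers. */
    static final class SegmentState {
        final String name;

        SegmentState(String name) {
            this.name = name;
        }
    }

    /** A reader pinned to one commit, holding one state object per segment. */
    static final class Reader {
        final Commit commit;
        final Map<String, SegmentState> segments;

        Reader(Commit commit, Map<String, SegmentState> segments) {
            this.commit = commit;
            this.segments = segments;
        }
    }

    /** Counts how many segment states were loaded from scratch. */
    static int loads = 0;

    static Reader open(Commit commit) {
        return reopen(null, commit);
    }

    /** Builds a reader for the target commit, reusing state from old wherever the segment matches. */
    static Reader reopen(Reader old, Commit target) {
        Map<String, SegmentState> subs = new LinkedHashMap<>();
        for (String name : target.segments) {
            SegmentState state = (old == null) ? null : old.segments.get(name);
            if (state == null) {
                state = new SegmentState(name); // not shared with old: load it fresh
                loads++;
            }
            subs.put(name, state);
        }
        return new Reader(target, subs);
    }

    public static void main(String[] args) {
        Commit t0 = new Commit("T0", "_0", "_1");
        Commit t1 = new Commit("T1", "_0", "_1", "_2");
        Reader atT1 = open(t1);
        Reader atT0 = reopen(atT1, t0); // go back to T0, sharing _0 and _1
        System.out.println("segments at T0: " + atT0.segments.keySet() + ", loads: " + loads);
    }
}
```

Nothing in the model requires the target commit to be newer than the reader being reopened, which is the property the T0/T1 use case depends on.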
