lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: BlockTreeTermsReader consumes crazy amount of memory
Date Thu, 11 Sep 2014 01:35:11 GMT
Yes, there is also a safety check, but IMO it should be removed.

See the patch on the issue; the test passes now.

On Wed, Sep 10, 2014 at 9:31 PM, Vitaly Funstein <vfunstein@gmail.com> wrote:
> Seems to me the bug occurs regardless of whether the passed-in newer reader
> is NRT or non-NRT. This is because the user operates at the level of
> DirectoryReader, not SegmentReader. Modifying the test code as follows
> reproduces the bug:
>
>     writer.commit();
>     DirectoryReader latest = DirectoryReader.open(writer, true);
>
>     // This reader will be used for searching against commit point 1
>     DirectoryReader searchReader =
>         DirectoryReader.openIfChanged(latest, ic1); // <=== Exception/Assertion thrown here
>
>
> On Wed, Sep 10, 2014 at 6:26 PM, Robert Muir <rcmuir@gmail.com> wrote:
>
>> That's because there are 3 constructors in SegmentReader:
>>
>> 1. one used for opening new (checks hasDeletions, only reads liveDocs if
>> so)
>> 2. one used for non-NRT reopen <-- problem one for you
>> 3. one used for NRT reopen (takes a LiveDocs as a param, so no bug)
>>
>> So personally I think you should be able to do this; we just have to
>> add the hasDeletions check to #2.
>>
>> On Wed, Sep 10, 2014 at 7:46 PM, Vitaly Funstein <vfunstein@gmail.com>
>> wrote:
>> > One other observation - if instead of a reader opened at a later commit
>> > point (T1), I pass in an NRT reader *without* doing the second commit on
>> > the index prior, then there is no exception. This probably also hinges on
>> > the assumption that no buffered docs have been flushed after T0, thus
>> > creating new segment files, as well... unfortunately, our system can't
>> > make either assumption.
>> >
>> > On Wed, Sep 10, 2014 at 4:30 PM, Vitaly Funstein <vfunstein@gmail.com>
>> > wrote:
>> >
>> >>> Normally, reopens only go forwards in time, so if you could ensure
>> >>> that when you reopen one reader to another, the 2nd one is always
>> >>> "newer", then I think you should never hit this issue
>> >>
>> >>
>> >> Mike, I'm not sure if I fully understand your suggestion. In a nutshell,
>> >> the use case here is as follows: I want to be able to search the index at
>> >> a particular point in time, let's call it T0. To that end, I save the
>> >> state at that time via a commit and take a snapshot of the index. After
>> >> that, the index is free to move on to another point in time, say T1, and
>> >> likely does. The optimization we have been discussing (and this is what
>> >> the test code I posted does) basically asks the reader to go back to
>> >> point T0 while reusing as much of the state of the index from T1 as
>> >> possible, as long as it is unchanged between the two.
>> >>
>> >> That's what DirectoryReader.openIfChanged(DirectoryReader, IndexCommit)
>> >> is supposed to do internally... or am I misinterpreting the
>> >> intent/implementation of it?
>> >>
>>

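For readers of the archive, the three construction paths listed in the quoted
message can be modelled in a few lines of plain Java. This is only a sketch and
not Lucene code: SegmentAtCommit, SegmentView, and every method name here are
hypothetical. It assumes the failure comes from path 2 reading live docs
unconditionally, even when the target commit recorded no deletions for the
segment.

```java
import java.util.BitSet;

/** Toy model of the three segment-reader construction paths; not Lucene code. */
class SegmentReopenSketch {

    /** Hypothetical stand-in for one segment as recorded in a commit point. */
    static final class SegmentAtCommit {
        final int maxDoc;
        final BitSet liveDocsOnDisk; // null: this commit recorded no deletions

        SegmentAtCommit(int maxDoc, BitSet liveDocsOnDisk) {
            this.maxDoc = maxDoc;
            this.liveDocsOnDisk = liveDocsOnDisk;
        }

        boolean hasDeletions() {
            return liveDocsOnDisk != null;
        }

        /** Simulates reading the live-docs file; fails if the commit has none. */
        BitSet readLiveDocs() {
            if (liveDocsOnDisk == null) {
                throw new IllegalStateException("commit has no live docs for this segment");
            }
            return (BitSet) liveDocsOnDisk.clone();
        }
    }

    /** Hypothetical per-segment reader; null liveDocs means every doc is live. */
    static final class SegmentView {
        final SegmentAtCommit commit;
        final BitSet liveDocs;

        SegmentView(SegmentAtCommit commit, BitSet liveDocs) {
            this.commit = commit;
            this.liveDocs = liveDocs;
        }

        int numDocs() {
            return liveDocs == null ? commit.maxDoc : liveDocs.cardinality();
        }
    }

    /** Path 1: fresh open, reads live docs only if the commit has deletions. */
    static SegmentView openNew(SegmentAtCommit commit) {
        return new SegmentView(commit, commit.hasDeletions() ? commit.readLiveDocs() : null);
    }

    /** Path 2 as assumed above: non-NRT reopen with no hasDeletions check.
     *  (State shared with the old reader is omitted from this model.) */
    static SegmentView reopenUnchecked(SegmentView old, SegmentAtCommit target) {
        return new SegmentView(target, target.readLiveDocs());
    }

    /** Path 2 with the proposed hasDeletions check added. */
    static SegmentView reopenChecked(SegmentView old, SegmentAtCommit target) {
        return new SegmentView(target, target.hasDeletions() ? target.readLiveDocs() : null);
    }

    /** Path 3: NRT reopen, live docs are handed in by the caller. */
    static SegmentView reopenNrt(SegmentView old, BitSet liveDocs) {
        return new SegmentView(old.commit, liveDocs);
    }

    public static void main(String[] args) {
        SegmentAtCommit t0 = new SegmentAtCommit(3, null); // no deletions at T0
        BitSet live = new BitSet(3);
        live.set(0);
        live.set(2);
        SegmentAtCommit t1 = new SegmentAtCommit(3, live); // one doc deleted by T1

        SegmentView latest = openNew(t1);
        SegmentView atT0 = reopenChecked(latest, t0); // going "back in time" works
        System.out.println(latest.numDocs() + " live at T1, " + atT0.numDocs() + " live at T0");

        try {
            reopenUnchecked(latest, t0);
        } catch (IllegalStateException e) {
            System.out.println("unchecked reopen failed: " + e.getMessage());
        }
    }
}
```

In this model the checked reopen from T1 back to T0 succeeds because it never
tries to read live docs that T0 did not write, which is the behaviour the
proposed hasDeletions check is meant to give path 2.
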
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

