lucene-dev mailing list archives

From John Wang <john.w...@gmail.com>
Subject Re: 2.9 NRT w.r.t. sorting and field cache
Date Tue, 22 Sep 2009 23:02:55 GMT
I understand what you are saying. Let me detail what I am trying to say:

When the "currently processed segments" are flushed, a merge may be
triggered. When a merge happens, some of the "stable segments" are merged
away and become invalid, and so does the FieldCache data keyed by them.
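
To make the concern concrete, here is a rough sketch in plain Java. It is not Lucene code: Segment, SegmentCacheSketch, get, retainOnly and merge are all hypothetical names, and the int[] merely stands in for a FieldCache entry. It only models a cache keyed by segment identity, where a merge produces a new key that has to be loaded from scratch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, NOT Lucene code: a per-segment cache keyed by segment identity.
class SegmentCacheSketch {
    static final class Segment {
        final String name;
        final int docCount;
        Segment(String name, int docCount) { this.name = name; this.docCount = docCount; }
    }

    // Segment does not override equals/hashCode, so keys compare by identity,
    // standing in for a cache keyed by the segment's reader.
    private final Map<Segment, int[]> cache = new HashMap<>();
    int loads = 0; // number of times an entry had to be built

    int[] get(Segment seg) {
        int[] values = cache.get(seg);
        if (values == null) {
            values = new int[seg.docCount]; // stands in for loading field values
            cache.put(seg, values);
            loads++;
        }
        return values;
    }

    // Drop entries whose segment is no longer part of the live index.
    void retainOnly(List<Segment> live) {
        cache.keySet().removeIf(s -> !live.contains(s));
    }

    // A merge yields a brand-new segment; the inputs disappear from the index.
    static Segment merge(List<Segment> inputs, String name) {
        int docs = 0;
        for (Segment s : inputs) docs += s.docCount;
        return new Segment(name, docs);
    }

    public static void main(String[] args) {
        SegmentCacheSketch cache = new SegmentCacheSketch();
        Segment big = new Segment("_0", 1000);
        Segment small = new Segment("_1", 10);
        cache.get(big);
        cache.get(small);
        cache.get(big); // unchanged segment: served from the cache

        Segment merged = merge(Arrays.asList(big, small), "_2");
        cache.retainOnly(new ArrayList<>(Arrays.asList(merged)));
        cache.get(merged); // new key: the whole merged segment is loaded again
        System.out.println("loads = " + cache.loads);
    }
}
```

In this model an unchanged segment never reloads, but merging even a tiny segment into a large one forces the cache for the whole merged segment to be rebuilt.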

In a high-update environment, this can happen quite often.

The default MergePolicy merges small segments into larger ones. Eventually
the segment being merged away is itself a large one, and when that happens a
large chunk of the FieldCache is invalidated at once.
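
As a back-of-the-envelope illustration of that cascade, here is a small simulation. It assumes a heavily simplified level-based policy (merge whenever mergeFactor segments sit at the same level), which is not the real LogMergePolicy; the class and method names are hypothetical. For each flush it reports the size of the segment that is new afterwards, i.e. how many documents would need a fresh FieldCache entry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation, NOT Lucene code: a simplified level-based merge policy.
class MergeInvalidationSketch {
    // For each flush, returns the doc count of the segment that is new after
    // the flush (the flushed segment itself, or the result of a merge cascade).
    static long[] simulate(int flushes, int docsPerFlush, int mergeFactor) {
        List<long[]> segments = new ArrayList<>(); // each entry: {level, docCount}
        long[] reloaded = new long[flushes];
        for (int f = 0; f < flushes; f++) {
            segments.add(new long[] {0, docsPerFlush});
            reloaded[f] = docsPerFlush; // no merge: only the small new segment loads
            long level = 0;
            while (true) {
                List<long[]> same = new ArrayList<>();
                for (long[] s : segments) {
                    if (s[0] == level) same.add(s);
                }
                if (same.size() < mergeFactor) break;
                long docs = 0;
                for (long[] s : same) docs += s[1];
                segments.removeAll(same); // arrays compare by identity
                segments.add(new long[] {level + 1, docs});
                reloaded[f] = docs; // the merged segment must be loaded from scratch
                level++;            // the merge may cascade to the next level
            }
        }
        return reloaded;
    }

    public static void main(String[] args) {
        long[] r = simulate(100, 100, 10);
        for (int f = 0; f < r.length; f++) {
            if (r[f] > 100) {
                System.out.println("flush " + (f + 1) + ": reload " + r[f] + " docs");
            }
        }
    }
}
```

Under these assumptions most flushes only cost a small load, but every mergeFactor-th flush costs a load mergeFactor times larger, and the cost keeps growing by that factor at each higher level.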

Furthermore, under heavy updates the stable segments can be invalidated
much sooner when they contain deletes, and I would guess the corresponding
FieldCache entries need to be adjusted. I am not sure how that is handled
right now.

Just my two cents, and of course when I find the time I will need to run
some tests to see.

-John

On Tue, Sep 22, 2009 at 3:59 PM, Uwe Schindler <uwe@thetaphi.de> wrote:

>  The NRT reader returned by IndexWriter.getReader() only has changes in
> the currently processed segments; the other segments stay stable (and so do
> their IndexReader keys used for the FieldCache). To the consumer it looks
> like a normal reader (it is in fact a ReadOnlyDirectoryReader) that supports
> getSequentialSubReaders() and so on.
>
>
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>   ------------------------------
>
> *From:* John Wang [mailto:john.wang@gmail.com]
> *Sent:* Tuesday, September 22, 2009 9:32 AM
> *To:* java-dev@lucene.apache.org
> *Subject:* Re: 2.9 NRT w.r.t. sorting and field cache
>
>
>
> Thanks Mark for the pointer!
>
> I guess my point is with NRT, and when segment files change often, this
> would be an issue, no?
>
> Anyway, I can run some tests.
>
> Thanks
>
> -John
>
> On Tue, Sep 22, 2009 at 3:21 PM, Mark Miller <markrmiller@gmail.com>
> wrote:
>
> LUCENE-1483: IndexSearcher pulls out a reader's subreaders (SegmentReaders)
> and sends a collector over them one by one, rather than using the
> MultiReader. So only the FieldCache for segment readers that change needs to
> be reloaded.
>
> - Mark
>
>
>
> http://www.lucidimagination.com (mobile)
>
>
> On Sep 22, 2009, at 1:27 AM, John Wang <john.wang@gmail.com> wrote:
>
>  Hi Yonik:
>
>      Actually that is what I am looking for. Can you please point me to
> where/how sorting is done per-segment?
>
>      When heavy indexing introduces or modifies segments, would it cause
> the FieldCache to be reloaded at query time and thus impact search
> performance?
>
> thanks
>
> -John
>
> On Tue, Sep 22, 2009 at 1:05 PM, Yonik Seeley <yonik@lucidimagination.com>
> wrote:
>
> On Tue, Sep 22, 2009 at 12:56 AM, John Wang <john.wang@gmail.com> wrote:
> > Looking at the code, seems there is a disconnect between how/when field
> > cache is loaded when IndexWriter.getReader() is called.
>
> I'm not sure what you mean by "disconnect"
>
> > Is FieldCache updated?
>
> FieldCache entries are populated on demand, as they always have been.
>
>
> > Otherwise, are we reloading FieldCache for each
> > reader instance?
>
> Searching/sorting is now per-segment, and so is the use of the
> FieldCache.  Segments that don't change shouldn't have to reload their
> FieldCache entries.
>
> -Yonik
> http://www.lucidimagination.com
>