lucene-dev mailing list archives

From Michael McCandless <>
Subject Re: Java caching of low-level index data?
Date Thu, 23 Jul 2009 14:50:31 GMT
On Thu, Jul 23, 2009 at 10:03 AM, Nigel<> wrote:

> Mike, the question you raise is whether (or to what degree) the OS will swap
> out app memory in favor of IO cache.  I don't know anything about how the
> Linux kernel makes those decisions, but I guess I had hoped that (regardless
> of the swappiness setting) it would be less likely to swap out application
> memory for IO, than it would be to replace some cached IO data with some
> different cached IO data.

I think swappiness is exactly the configuration that tells Linux just
how happily it should swap out application memory for IO cache, vs
evicting other IO cache for new IO cache.

> The latter case is what kills Lucene performance
> when you've got a lot of index data in the IO cache and a file copy or some
> other operation replaces it all with something else: the OS has no way of
> knowing that some IO cache is more desirable long-term than other IO
> cache.

I agree that hurts Lucene, but the former also hurts Lucene.  EG if
the OS swaps out our norms, terms index, deleted docs, field cache,
then that's gonna hurt search performance.  You hit maybe 10 page faults
and suddenly you're looking at an unacceptable increase in the search
latency.

For a dedicated search box (your case) it'd be great to wire these
pages (or, set swappiness to 0 and make sure you have plenty of RAM,
which is supposed to do the same thing I believe).
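
A minimal sketch for checking the current setting, assuming a Linux box
that exposes it at /proc/sys/vm/swappiness. The helper name
read_swappiness is hypothetical, and the path parameter exists only so
the function can be pointed at another file.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the vm.swappiness value as an int, or None if unavailable.

    Assumes Linux; on other systems the file will not exist.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected contents.
        return None


if __name__ == "__main__":
    value = read_swappiness()
    if value is None:
        print("vm.swappiness not available on this system")
    else:
        print(f"vm.swappiness = {value}")
```

Changing the value is normally done as root, for example with
`sysctl vm.swappiness=0`; whether 0 really behaves like wiring the
pages depends on the kernel version.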

> The former case (swapping app for IO cache) makes sense, I suppose, if the
> app memory hasn't been used in a long time, but with an LRU cache you should
> be hitting those pages pretty frequently by definition.

EG if your terms index is large, I bet many pages will be seen by the
OS as rarely used.  We do a binary search through it... so the upper
levels of that binary search tree are frequently hit, but the lower
levels will be much less frequently hit.  I can see the OS happily
swapping out big chunks of the terms dict index.  And it's quite costly
because we don't have good locality in how we access it (except
towards the very end of the binary search).
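
The skew described above can be illustrated with a small simulation. It
is only a sketch: it binary-searches a plain sorted array of slot
numbers, not Lucene's actual terms index, and the function name
probe_counts and the sizes used are assumptions chosen for
illustration.

```python
import random
from collections import Counter


def probe_counts(num_terms, num_lookups, seed=0):
    """Count how often each slot is probed by binary search.

    Looks up num_lookups random targets in a sorted array of
    num_terms slots and records every midpoint that is examined.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_lookups):
        target = rng.randrange(num_terms)
        lo, hi = 0, num_terms - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            counts[mid] += 1  # this slot's memory is touched
            if mid == target:
                break
            if mid < target:
                lo = mid + 1
            else:
                hi = mid - 1
    return counts


if __name__ == "__main__":
    num_terms, num_lookups = 1 << 16, 1000
    counts = probe_counts(num_terms, num_lookups)
    root = (num_terms - 1) // 2
    print("probes of the root slot:", counts[root])
    print("slots never probed:", num_terms - len(counts))
```

The root slot is probed on every lookup, while most slots are never
touched at all in a run of this size, which is the kind of access
pattern that can make large parts of the structure look idle to the OS.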

> But if it does swap out your Java cache for something else, you're
> probably no worse off than before, right?  In this case you have to
> hit the disk to fault in the paged-out cache; in the original case
> you have to hit the disk to read the index data that's not in IO
> cache.

Hard to say... if it swaps out the postings, since we tend to access
them sequentially, we have good locality and so swapping back in
should be faster (I *think*).  I guess norms, field cache and deleted
docs also have good locality.  Though... I'm actually not sure how
effectively VM systems take advantage of locality when page faults are

> Anyway, the interaction between these things (virtual memory, IO cache,
> disk, JVM, garbage collection, etc.) is complex and so the optimal
> configuration is very usage-dependent.  The current Lucene behavior seems to
> be the most flexible.  When/if I get a chance to try the Java caching for
> our situation I'll report the results.

I think the biggest low-hanging-fruit in this area would be an
optional JNI-based extension to Lucene that'd allow merging to tell
the OS *not* to cache the bytes that we are reading, and to optimize
those file descriptors for sequential access (eg do aggressive
readahead).  It's a nightmare that a big segment merge can evict not
only IO cache but also (with the default swappiness on most Linux
distros) evict our in-RAM caches too!
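
The kind of hint meant here can be sketched with posix_fadvise. The
example below uses Python's os.posix_fadvise (available on Unix)
instead of a JNI extension, purely to show the calls involved; the
function name copy_with_cache_hints and the 1 MB chunk size are
assumptions, and this is not how Lucene's merging is implemented.

```python
import os


def _advise(fd, offset, length, advice):
    """Pass a hint to the kernel; ignore failures, since it is only advice."""
    try:
        os.posix_fadvise(fd, offset, length, advice)
    except OSError:
        pass


def copy_with_cache_hints(src_path, dst_path, chunk_size=1 << 20):
    """Copy src to dst, asking the OS not to keep the source bytes cached.

    Returns the number of bytes copied. Falls back to a plain copy
    where posix_fadvise is not available.
    """
    can_advise = hasattr(os, "posix_fadvise")
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        fd = src.fileno()
        if can_advise:
            # Length 0 means "to end of file": expect sequential reads,
            # so the kernel may read ahead more aggressively.
            _advise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            if can_advise:
                # The bytes just consumed are not needed again.
                _advise(fd, copied, len(chunk), os.POSIX_FADV_DONTNEED)
            copied += len(chunk)
    return copied
```

These calls are hints only; the kernel is free to ignore them, so any
benefit would need to be measured on the target system.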

