lucene-java-user mailing list archives

Subject mmap loads the entire index into memory during forceMergeDeletes/forceMerge(int)
Date Thu, 17 Jan 2013 16:12:24 GMT
On a machine with 256 GB of RAM, we have half of our IT system running.
Part of it consists of two Lucene applications, each managing an index of approximately 100 GB.
These applications are used to index logging events, and every night there is a purge, followed
by a forceMergeDeletes to reclaim disk space (and I have configured forceMergeDeletesPctAllowed=0
to make sure I really do get the space back).
We have had production incidents recently because the resident size would go as high as
100 GB on each Lucene app process during forceMergeDeletes. After an investigation, I figured
out that the default behavior on 64-bit platforms is to instantiate an MMapDirectory.
I read an article about it, and I guess
I am seeing something different from what it states: "MMapDirectory will not
load the whole index into physical memory. Why should it do this? We just ask the operating
system to map the file into address space for easy access, by no means we are requesting more.
Java and the O/S optionally provide the option to try loading the whole file into RAM (if
enough is available), but Lucene does not use that option (we may add this possibility in
a later version)."

At least, in my case it does load the entire index into memory.
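
For illustration, here is a minimal Java sketch of the distinction the quoted article draws. It uses only java.nio, not Lucene, and the class and method names (MmapSketch, mapReadOnly, touchAll) are hypothetical. The assumption behind it, which is a possible explanation and not a confirmed diagnosis of the incident above, is that mapping a file only reserves address space, while every page that is actually read can become resident; a merge has to read the segments it rewrites. MappedByteBuffer.load() is the explicit "try to load the whole file" option that Java provides.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MmapSketch {

    /** Maps a (small) file read-only. Mapping alone does not read the file contents. */
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    /** Reads every byte once, so every page of the mapping gets touched. */
    static long touchAll(MappedByteBuffer buf) {
        long sum = 0;
        for (int i = 0; i < buf.limit(); i++) {
            sum += buf.get(i) & 0xFF; // absolute get: does not move the buffer position
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-sketch", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[4 * 1024 * 1024]); // 4 MiB of zeros

        MappedByteBuffer buf = mapReadOnly(tmp);
        // isLoaded() is only a hint from the OS; its value is not guaranteed either way.
        System.out.println("isLoaded after map:   " + buf.isLoaded());
        touchAll(buf);
        System.out.println("isLoaded after touch: " + buf.isLoaded());
        // buf.load() would explicitly ask the OS to bring the whole mapping into RAM.
    }
}
```

Pages touched this way are file-backed, so the operating system can drop them under memory pressure, but they are typically counted in the resident size of the process while they stay mapped and in RAM.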

This behavior caught everybody by surprise, even the app server administrators, who are used
to monitoring heaps on Java machines, not resident memory.
Mmap is appropriate when the Lucene app is running alone on a physical box or a VM, because
if it wants to load the entire index into memory, it will swap out its own pages. But when
you use a shared machine, you start hurting the other app servers.
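
To make the heap versus resident memory distinction concrete, here is a small sketch that prints both figures for the current JVM. It assumes a Linux host where /proc/self/status exposes a VmRSS line; the class and method names (ResidentVsHeap, parseVmRssKb, heapUsedBytes) are hypothetical. Memory-mapped index pages are outside the Java heap, so they would show up only in the second number.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class ResidentVsHeap {

    /** Extracts the VmRSS value (in kB) from /proc/<pid>/status lines, or -1 if absent. */
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Expected shape: "VmRSS:     123456 kB"
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    /** Heap currently in use, as seen by the JVM. */
    static long heapUsedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status"); // Linux-only assumption
        long rssKb = Files.isReadable(status)
                ? parseVmRssKb(Files.readAllLines(status))
                : -1;
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("resident set (kB): " + (rssKb < 0 ? "unavailable" : String.valueOf(rssKb)));
    }
}
```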

I think you should raise awareness about the effects of mmap usage when dealing with a large index.



