lucene-java-user mailing list archives

From Steven Schlansker
Subject Re: DocValues formats hold large byte[][]s even when using MMapDirectory
Date Wed, 02 Oct 2013 18:37:57 GMT

On Oct 2, 2013, at 11:16 AM, Michael McCandless wrote:

> In Lucene 4.5 (coming out any day now) we've switched by default to a
> "mostly on disk" impl for doc values.

Awesome!  Looking forward to that then.

> Before that, you can use DiskDocValuesFormat instead.
> But you'll need to re-index (or create a new index and use
> IW.addIndexes) to cutover your current index to the DiskDVFormat.

I see a few references scattered on the internet but it's not in my Lucene jars.  The one
reference I saw to it indicated that every patch release of Lucene will require a full reindex
when using this, which is a serious bummer.

So I think I'll hold out for 4.5 and hope that that solves my problem.
Thanks for the help!

> On Wed, Oct 2, 2013 at 2:11 PM, Steven Schlansker wrote:
>> Hi,
>> I have a search application using Lucene 4.4.0 with various BinaryDocValues and SortedSetDocValues.
>> We use MMapDirectory to help keep the Java heap small / GC pause times short and
>> instead rely on the OS buffer cache to keep things fast, which I gather is generally considered
>> a "best practice" around here.
>> As our index grows, I've noticed that we are getting GC pauses and later OOM errors
>> when reloading a new index due to gigabytes of byte[][]s held by Lucene42DocValuesProducer,
>> specifically the PagedBytes.Reader.blocks from within Lucene42DocValuesProducer.loadBinary
>> I would have expected DocValues fields to use mapped bytes instead of copying into
>> the Java heap much as the "main" index data is.  Is this a technical limitation, a "we haven't
>> gotten there yet" feature request, or something different entirely?
>> Thanks for helping my understanding,
>> Steven
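
For readers of the archive: the distinction the question above turns on, bytes copied into the Java heap versus bytes read through a memory mapping, can be sketched with plain java.nio. This is only an illustration under stated assumptions, not Lucene's actual code path: the class name HeapVsMapped and both helper methods are hypothetical, and neither Lucene42DocValuesProducer nor MMapDirectory is involved.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Lucene.
public class HeapVsMapped {

    /** Copies the whole file into a byte[] on the Java heap, so the GC has to track it. */
    public static byte[] loadOnHeap(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    /** Maps the file read-only; the contents are served from the OS page cache, outside the Java heap. */
    public static MappedByteBuffer mapOffHeap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // A mapping stays valid after the channel that created it is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("docvalues-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] {10, 20, 30, 40});

        byte[] heapCopy = loadOnHeap(file);
        ByteBuffer mapped = mapOffHeap(file);

        System.out.println("heap copy length: " + heapCopy.length);
        System.out.println("mapped buffer is direct (off-heap): " + mapped.isDirect());
        System.out.println("same byte either way: " + (heapCopy[2] == mapped.get(2)));
    }
}
```

Both variants return the same bytes. The difference is where they live: the byte[] counts against the heap limit, while the mapped buffer leaves paging to the operating system, which is the behaviour the question expected for DocValues fields.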
