lucene-java-user mailing list archives

From Michael McCandless <>
Subject Re: suppressing FreqProxPostingsArray
Date Mon, 19 Mar 2012 21:32:07 GMT
Hmm, I agree we could be more RAM efficient if the field is DOCS_ONLY.

We shouldn't have to allocate or use the docFreqs, lastDocCodes, and
lastPositions arrays (3 of the 7); the others are still needed, I
think.
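
As a rough back-of-the-envelope sketch of what that would save (this is
not Lucene code): assume each of the parallel arrays holds one 4-byte int
per buffered unique term, and ignore array headers and the slack left by
over-allocation when the arrays grow. The class name, method name, and
the term count below are hypothetical and only illustrate the arithmetic.

```java
public class PostingsArrayEstimate {
    // Assumption: every parallel array is an int[] with one slot per unique term.
    static final int BYTES_PER_INT = 4;

    /** Estimated bytes used by arrayCount parallel int[] arrays of uniqueTerms slots each. */
    public static long bytesFor(long uniqueTerms, int arrayCount) {
        return uniqueTerms * arrayCount * BYTES_PER_INT;
    }

    public static void main(String[] args) {
        long uniqueTerms = 1_000_000L;              // hypothetical number of buffered unique terms
        long allArrays = bytesFor(uniqueTerms, 7);  // all 7 arrays allocated
        long docsOnly = bytesFor(uniqueTerms, 4);   // 3 of the 7 skipped for DOCS_ONLY
        System.out.println("7 arrays: " + allArrays + " bytes");
        System.out.println("4 arrays: " + docsOnly + " bytes");
        System.out.println("saved:    " + (allArrays - docsOnly) + " bytes");
    }
}
```

Under those assumptions the saving is 12 bytes per buffered unique term,
which is real but does not change when a flush is triggered.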

But, that said, you shouldn't hit OOME, as long as your max heap size
is large enough (and your IndexWriterConfig's RAMBufferSizeMB is
small enough); Lucene should simply flush a new segment once the
buffered documents are using too much RAM.
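
To make the flush-by-RAM idea concrete, here is a toy sketch in plain
Java. It is not how Lucene accounts for memory; the class
RamBoundedBuffer, its 2-bytes-per-char estimate, and the 100-byte limit
are all hypothetical. It only shows the pattern: track an estimate of
buffered bytes and flush once it reaches the configured limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class RamBoundedBuffer {
    private final long maxBufferedBytes;   // plays the role of the RAM buffer limit
    private final List<String> buffered = new ArrayList<>();
    private long bufferedBytes = 0;
    private int flushCount = 0;

    public RamBoundedBuffer(long maxBufferedBytes) {
        this.maxBufferedBytes = maxBufferedBytes;
    }

    /** Buffers one document, then flushes if the estimate has reached the limit. */
    public void add(String doc) {
        buffered.add(doc);
        bufferedBytes += 2L * doc.length(); // crude estimate: 2 bytes per char
        if (bufferedBytes >= maxBufferedBytes) {
            flush();
        }
    }

    /** Stands in for writing a segment: drops the buffered docs and resets the estimate. */
    private void flush() {
        buffered.clear();
        bufferedBytes = 0;
        flushCount++;
    }

    public int flushCount() { return flushCount; }

    public long bufferedBytes() { return bufferedBytes; }

    public static void main(String[] args) {
        RamBoundedBuffer buffer = new RamBoundedBuffer(100); // hypothetical tiny limit
        for (int i = 0; i < 10; i++) {
            buffer.add(UUID.randomUUID().toString());        // 36-char UUID strings
        }
        System.out.println("flushes: " + buffer.flushCount());
    }
}
```

Note that in this sketch the check runs only after a whole document has
been buffered, so a limit of this kind bounds memory across documents but
cannot help if a single document is itself too large for the heap.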

Hmm, that also assumes you don't index massive documents.  How many UUIDs per document?

Mike McCandless

On Mon, Mar 19, 2012 at 3:29 PM, Ken McCracken <> wrote:
> Hi,
> I am using lucene-3.5 and getting an OutOfMemoryError on a large indexing
> task of 100M documents.  I am creating an index with 3 UUIDs as separate
> field values.  I am using Store.YES on 1 of them and Store.NO on the
> others; I am using Index.NOT_ANALYZED_NO_NORMS on all three; explicitly
> setting field.setIndexOptions(IndexOptions.DOCS_ONLY); and
> indexWriterConfig.setTermIndexInterval(termIndexInterval); to 1024.  I am
> trying to index 100M records into my index.
> Is there any reason FreqProxTermsWriterPerField.FreqProxPostingsArray needs
> to be constructed even though I have the positions etc suppressed?  It
> seems that the reason I get an OutOfMemoryError is that 7 int[] of size
> proportional to number of unique fields are being constructed; however, at
> least some of them are probably wasteful given my indexing configurations.
> Any help is appreciated.
> Thanks,
> -Ken
>     [junit] Error:
>    [junit] Exception in thread "Thread-18" java.lang.OutOfMemoryError:
> Java heap space
>    [junit]     at
> org.apache.lucene.index.ParallelPostingsArray.<init>(
>    [junit]     at
> org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.<init>(
>    [junit]     at
> org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.newInstance(
>    [junit]     at
> org.apache.lucene.index.ParallelPostingsArray.grow(
>    [junit]     at
> org.apache.lucene.index.TermsHashPerField.growParallelPostingsArray(
>    [junit]     at
> org.apache.lucene.index.TermsHashPerField.add(
>    [junit]     at
> org.apache.lucene.index.DocInverterPerField.processFields(
>    [junit]     at
> org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(
