lucene-java-user mailing list archives

From Michael McCandless <>
Subject Re: Optimize FTS memory footprint
Date Wed, 13 Dec 2017 00:22:32 GMT
Comments below:

On Tue, Nov 28, 2017 at 4:47 PM, elirev <> wrote:

> Thanks, Mike.
> I did not find any clear way to tell whether it is the FST, the norms, or
> something else (unless I am missing something). The fact that the FST is an
> in-memory prefix index leads me to think it is using most of the heap.
> Our mapping is ordinary, with around 200 columns; one of the columns is a
> nested object with a limited number of instances (up to 4). We use monthly
> indexes and keep 6 months open. In the last month I see a dramatic increase
> in segment memory (around 30%, where a regular month is around 5%). The
> only change I see is that the nested object now holds 8 instances on
> average, which increases the number of hidden documents in the index to
> more than twice what it was.
> When we optimized the index, the allocated memory was reduced (we saw it
> only after a rolling restart of the nodes).
> If you don't mind, I have a few questions:
> 1) Do you know of a way to figure out which component is taking all this
> memory?

How about a profiler?  E.g. YourKit works well for this in my experience.

You should also use Lucene's own Accountable API from the leaf readers --
this gives you a very detailed breakdown of Lucene's own accounting of
what's using RAM.
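
To make the idea of a "detailed breakdown" concrete, here is a minimal, self-contained sketch of walking such a tree. `RamNode` and `RamBreakdown` are hypothetical stand-ins that only mirror the shape of Lucene's `Accountable` (a `ramBytesUsed()` total plus child resources); they are not Lucene classes, and the byte counts are made up. In Lucene releases of this era (7.x, for example) the segment-level readers implement `org.apache.lucene.util.Accountable`, and `Accountables.toString(...)` renders a similar nested report, but check the Javadoc for your version since this has changed across releases.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in mirroring the shape of Lucene's Accountable
// (a RAM total plus child resources). This is NOT the Lucene interface.
final class RamNode {
    final String name;
    final long ownBytes;           // bytes held directly by this node
    final List<RamNode> children;  // nested consumers (may be empty)

    RamNode(String name, long ownBytes, RamNode... children) {
        this.name = name;
        this.ownBytes = ownBytes;
        this.children = children.length == 0
                ? Collections.<RamNode>emptyList()
                : Arrays.asList(children);
    }

    // Total for this node: its own bytes plus everything below it.
    long ramBytesUsed() {
        long total = ownBytes;
        for (RamNode child : children) {
            total += child.ramBytesUsed();
        }
        return total;
    }
}

final class RamBreakdown {
    // Renders the tree, one line per node, indented two spaces per level.
    static String render(RamNode node) {
        StringBuilder out = new StringBuilder();
        render(node, 0, out);
        return out.toString();
    }

    private static void render(RamNode node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.name).append(": ")
           .append(node.ramBytesUsed()).append(" bytes\n");
        for (RamNode child : node.children) {
            render(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Made-up numbers for a single segment, for illustration only.
        RamNode segment = new RamNode("segment _0", 0,
                new RamNode("postings (terms index FST)", 4_000_000),
                new RamNode("norms", 1_500_000),
                new RamNode("doc values", 500_000));
        System.out.print(render(segment));
    }
}
```

Reading the report per segment, rather than as one heap total, is what lets you tell the terms index apart from norms or other structures.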

> 2) Do you see a relation between the increase in nested objects and the
> extra memory allocation we have?

Well, nested objects cause more documents to be created, which may increase
the heap needed to hold some data structures, e.g. the live docs bitset.
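
As a rough back-of-the-envelope sketch of that effect: assuming each nested object is indexed as its own hidden document alongside its parent, and assuming the live docs bitset costs one bit per document stored in 64-bit words (and exists only for segments that have deletions), the arithmetic looks like this. The class name and the parent count are hypothetical, and array and object overhead is ignored.

```java
// Rough estimate only; the names and the document counts are illustrative.
final class LiveDocsEstimate {
    // One bit per document, packed into 64-bit words (object overhead ignored).
    static long bitsetBytes(long maxDoc) {
        long words = (maxDoc + 63) / 64;
        return words * Long.BYTES;
    }

    // Each parent document plus one hidden document per nested object.
    static long totalDocs(long parents, long nestedPerParent) {
        return parents * (1 + nestedPerParent);
    }

    public static void main(String[] args) {
        long parents = 100_000_000L; // hypothetical number of top-level documents
        for (long nested : new long[] {4, 8}) {
            long docs = totalDocs(parents, nested);
            System.out.println(nested + " nested per parent: " + docs
                    + " docs, about " + bitsetBytes(docs) + " bytes of live docs");
        }
    }
}
```

The bitset grows linearly with the document count, so it is worth comparing this estimate against the measured breakdown before concluding which structure accounts for the growth.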

> 3) Is FST memory usage affected by optimizing the problematic index, and
> why do we see the change only after restarting the ES service?

Optimize also removes all pending deleted documents.  You should pull the
Accountables before and after the optimize to see where the changes really
are.
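
One way to compare the two snapshots, as a minimal sketch: flatten each breakdown into a map from component name to bytes, then diff the maps. `RamDiff` and the component names used with it are hypothetical; how you collect the per-component numbers depends on your Lucene version.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper: compares two name -> bytes snapshots.
final class RamDiff {
    // Returns (after - before) per component, largest absolute change first.
    // A component missing from one snapshot counts as 0 bytes on that side.
    static Map<String, Long> diff(Map<String, Long> before, Map<String, Long> after) {
        TreeSet<String> names = new TreeSet<>(before.keySet());
        names.addAll(after.keySet());

        List<Map.Entry<String, Long>> deltas = new ArrayList<>();
        for (String name : names) {
            long delta = after.getOrDefault(name, 0L) - before.getOrDefault(name, 0L);
            deltas.add(new AbstractMap.SimpleEntry<>(name, delta));
        }

        // Stable sort: ties keep the alphabetical order from the TreeSet.
        deltas.sort(Comparator.comparingLong(
                (Map.Entry<String, Long> e) -> -Math.abs(e.getValue())));

        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : deltas) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

Printing the resulting map shows at a glance which components shrank after the optimize and by how much.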

Mike McCandless
