lucene-solr-user mailing list archives

From John Davis <johndavis925...@gmail.com>
Subject Re: Solr Heap Usage
Date Fri, 07 Jun 2019 18:29:28 GMT
What would be the best way to understand where heap is being used?

On Tue, Jun 4, 2019 at 9:31 PM Greg Harris <harrisgreg07@gmail.com> wrote:

> Just a couple of points I’d make here. I did some testing a while back in
> which, if no commit is made (hard or soft), there are internal memory
> structures holding tlogs, and it will continue to get worse the more docs
> that come in. I don’t know if that’s changed in later versions. I’d
> recommend doing commits with some frequency in indexing-heavy
> apps; otherwise you are likely to have heap issues. I personally would
> second some of the points already made. There are too many variables in
> play, and too many ways to tune things, for sizing decisions to be anything
> other than a pure guess if you don’t test and monitor. I’d advocate for a
> process in which testing is done regularly to answer questions like number
> of shards/replicas, heap size, memory, etc. Hard data, a good process, and
> regular testing will trump guesswork every time.
>
> Greg
>
> On Tue, Jun 4, 2019 at 9:22 AM John Davis <johndavis925254@gmail.com>
> wrote:
>
> > You might want to test with a softcommit of hours vs 5m for heavy indexing +
> > light query -- even though there is internal memory structure overhead for
> > no soft commits, in our testing a 5m soft commit (via commitWithin) has
> > resulted in very large heap usage, which I suspect is because of
> > other overhead associated with it.
> >
> > On Tue, Jun 4, 2019 at 8:03 AM Erick Erickson <erickerickson@gmail.com>
> > wrote:
> >
> > > I need to update that; I didn’t understand the bits about retaining
> > > internal memory structures at the time.
> > >
> > > > On Jun 4, 2019, at 2:10 AM, John Davis <johndavis925254@gmail.com>
> > > wrote:
> > > >
> > > > Erick - These conflict, what's changed?
> > > >
> > > > So if I were going to recommend settings, they’d be something like
> > this:
> > > > Do a hard commit with openSearcher=false every 60 seconds.
> > > > Do a soft commit every 5 minutes.
> > > >
> > > > vs
> > > >
> > > > Index-heavy, Query-light
> > > > Set your soft commit interval quite long, up to the maximum latency you
> > > > can stand for documents to be visible. This could be just a couple of
> > > > minutes or much longer. Maybe even hours, with the capability of issuing
> > > > a hard commit (openSearcher=true) or soft commit on demand.
> > > >
> > > > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> > > >
> > > >
> > > >
> > > >
> > > > On Sun, Jun 2, 2019 at 8:58 PM Erick Erickson <erickerickson@gmail.com>
> > > > wrote:
> > > >
> > > >>> I've looked through SolrJ, DIH and others -- is the bottom line
> > > >>> across all of them to "batch updates" and not commit as long as
> > > >>> possible?
> > > >>
> > > >> Of course it’s more complicated than that ;)….
> > > >>
> > > >> But to start, yes, I urge you to batch. Here’s some stats:
> > > >> https://lucidworks.com/2015/10/05/really-batch-updates-solr-2/
> > > >>
> > > >> Note that at about 100 docs/batch you hit diminishing returns.
> > > >> _However_, that test was run on a single-shard collection, so if you
> > > >> have 10 shards you’d have to send 1,000 docs/batch. I wouldn’t sweat
> > > >> that number much; just don’t send one at a time. And there are the
> > > >> usual gotchas if your documents are 1M vs. 1K.
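The batching advice above boils down to a simple chunking loop. A minimal sketch (`client.add` here is a hypothetical stand-in for whatever Solr client you use, e.g. pysolr's `Solr.add` or SolrJ's `add(Collection<SolrInputDocument>)`):

```python
def batches(docs, size=100):
    """Yield successive batches of `size` docs. ~100 docs/batch is where
    the benchmark above hit diminishing returns on one shard; scale the
    batch size with shard count (e.g. 1,000 for 10 shards)."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

# Hypothetical usage -- one request per batch, never one per document:
#     for batch in batches(all_docs):
#         client.add(batch)
```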
> > > >>
> > > >> About committing: no, don’t hold off as long as possible. When you
> > > >> commit, segments are merged. _However_, the default 100M internal
> > > >> buffer size means that segments are written anyway, even if you don’t
> > > >> hit a commit point, whenever you have 100M of index data, and merges
> > > >> happen anyway. So you won’t save anything on merging by holding off
> > > >> commits, and you’ll incur penalties. Here’s more than you want to know
> > > >> about commits:
> > > >>
> > > >> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> > > >>
> > > >> But some key takeaways… If for some reason Solr terminates
> > > >> abnormally, the documents accumulated since the last hard
> > > >> commit are replayed. So say you don’t commit for an hour of
> > > >> furious indexing and someone does a “kill -9”. When you restart
> > > >> Solr, it’ll try to re-index all the docs from the last hour. Hard
> > > >> commits with openSearcher=false aren’t all that expensive. I usually
> > > >> set mine for a minute and forget about it.
> > > >>
> > > >> Transaction logs hold a window, _not_ the entire set of operations
> > > >> since time began. When you do a hard commit, the current tlog is
> > > >> closed, a new one is opened, and ones that are “too old” are deleted.
> > > >> If you never commit, you have a huge transaction log to no good
> > > >> purpose.
> > > >>
> > > >> Also, while indexing, in order to accommodate “Real Time Get”, all
> > > >> the docs indexed since the last searcher was opened have a pointer
> > > >> kept in memory. So if you _never_ open a new searcher, that internal
> > > >> structure can get quite large. So in bulk-indexing operations, I
> > > >> suggest you open a searcher every so often.
> > > >>
> > > >> Opening a new searcher isn’t terribly expensive if you have no
> > > >> autowarming going on. Autowarming is defined in solrconfig.xml in the
> > > >> filterCache, queryResultCache, etc.
> > > >>
> > > >> So if I were going to recommend settings, they’d be something like
> > > >> this:
> > > >> Do a hard commit with openSearcher=false every 60 seconds.
> > > >> Do a soft commit every 5 minutes.
> > > >>
> > > >> I’d actually be surprised if you were able to measure differences
> > > >> between those settings and just a hard commit with openSearcher=true
> > > >> every 60 seconds and a soft commit at -1 (never)…
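As a sketch, those recommended settings correspond to something like this in solrconfig.xml (times are in milliseconds):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>60000</maxTime>          <!-- hard commit every 60 seconds -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>300000</maxTime>         <!-- soft commit every 5 minutes; -1 disables -->
  </autoSoftCommit>
</updateHandler>
```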
> > > >>
> > > >> Best,
> > > >> Erick
> > > >>
> > > >>> On Jun 2, 2019, at 3:35 PM, John Davis <johndavis925254@gmail.com>
> > > >> wrote:
> > > >>>
> > > >>> If we assume there is no query load, then effectively this boils down
> > > >>> to the most effective way of adding a large number of documents to the
> > > >>> Solr index. I've looked through SolrJ, DIH and others -- is the bottom
> > > >>> line across all of them to "batch updates" and not commit as long as
> > > >>> possible?
> > > >>>
> > > >>> On Sun, Jun 2, 2019 at 7:44 AM Erick Erickson <erickerickson@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>>> Oh, there are about a zillion reasons ;).
> > > >>>>
> > > >>>> First of all, most tools that show heap usage also count
> > > >>>> uncollected garbage, so your 10G could actually be much less “live”
> > > >>>> data. A quick way to test is to attach jconsole to the running Solr
> > > >>>> and hit the button that forces a full GC.
> > > >>>>
> > > >>>> Another way is to reduce your heap when you start Solr (on a test
> > > >>>> system, of course) until bad stuff happens. If you reduce it to very
> > > >>>> close to what Solr needs, you’ll get slower as more and more cycles
> > > >>>> are spent on GC; if you reduce it a little more, you’ll get OOMs.
> > > >>>>
> > > >>>> You can take heap dumps of course to see where all the memory is
> > > >>>> being used, but that’s tricky as it also includes garbage.
> > > >>>>
> > > >>>> I’ve seen cache sizes (the filterCache in particular) be something
> > > >>>> that uses lots of memory, but that requires queries to be fired.
> > > >>>> Each filterCache entry can take up to roughly maxDoc/8 bytes, plus
> > > >>>> overhead….
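The maxDoc/8 figure is worth working through: each filterCache entry can be a bitset with one bit per document. A minimal sketch of the arithmetic (the 100M-doc / 512-entry figures are illustrative, not from this thread):

```python
def filter_cache_bytes(max_doc, entries):
    """Worst-case filterCache footprint: each entry can be a bitset of
    maxDoc bits, i.e. maxDoc/8 bytes, ignoring per-entry overhead."""
    return entries * (max_doc // 8)

# A 100M-doc core with a 512-entry filterCache can pin roughly
# 512 * 12.5 MB = 6.4 GB of heap if the cache fills with full bitsets.
```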
> > > >>>>
> > > >>>> A classic error is to sort, group or facet on a docValues=false
> > > >>>> field. Starting with Solr 7.6, you can add an option to fields to
> > > >>>> throw an error if you do this; see
> > > >>>> https://issues.apache.org/jira/browse/SOLR-12962.
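The option from SOLR-12962 is the per-field `uninvertible` attribute. As a sketch (field name hypothetical), a schema entry that fails fast instead of silently uninverting on the heap would look like:

```xml
<!-- Sorting/faceting/grouping on this field now errors out instead of
     building a large in-memory uninverted structure -->
<field name="category" type="string" indexed="true" stored="true"
       docValues="false" uninvertible="false"/>
```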
> > > >>>>
> > > >>>> In short, there’s not enough information to tell until you dive in
> > > >>>> and test a bunch of stuff.
> > > >>>>
> > > >>>> Best,
> > > >>>> Erick
> > > >>>>
> > > >>>>
> > > >>>>> On Jun 2, 2019, at 2:22 AM, John Davis <johndavis925254@gmail.com>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>> This makes sense. Any ideas why lucene/solr would use a 10g heap
> > > >>>>> for a 20g index? My hypothesis was that merging segments was trying
> > > >>>>> to read it all, but if that's not the case I am out of ideas. The
> > > >>>>> one caveat is we are trying to add the documents quickly (~1g an
> > > >>>>> hour), but if lucene does write 100m segments and does a streaming
> > > >>>>> merge, it shouldn't matter?
> > > >>>>>
> > > >>>>> On Sat, Jun 1, 2019 at 9:24 AM Walter Underwood <wunder@wunderwood.org>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>>> On May 31, 2019, at 11:27 PM, John Davis <johndavis925254@gmail.com>
> > > >>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>> 2. Merging segments - does solr load the entire segment in memory
> > > >>>>>>> or chunks of it? If the latter, how large are these chunks?
> > > >>>>>>
> > > >>>>>> No, it does not read the entire segment into memory.
> > > >>>>>>
> > > >>>>>> A fundamental part of the Lucene design is streaming posting lists
> > > >>>>>> into memory and processing them sequentially. The same amount of
> > > >>>>>> memory is needed for small or large segments. Each posting list is
> > > >>>>>> in document-id order. The merge is a merge of sorted lists, writing
> > > >>>>>> a new posting list in document-id order.
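The streaming merge described here can be illustrated with a heap-based k-way merge of sorted lists. This is just the idea, not Lucene's actual merge code:

```python
import heapq

def merge_postings(*postings):
    """Merge doc-id-sorted posting lists into one sorted stream, the way
    a segment merge streams postings: memory is proportional to the
    number of lists, never to the size of the segments themselves."""
    return heapq.merge(*postings)

# Each input is consumed lazily, one doc id at a time, and the output
# is produced in document-id order.
```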
> > > >>>>>>
> > > >>>>>> wunder
> > > >>>>>> Walter Underwood
> > > >>>>>> wunder@wunderwood.org
> > > >>>>>> http://observer.wunderwood.org/  (my blog)
> > > >>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>>
> > > >>
> > > >>
> > >
> > >
> >
>
