lucene-solr-user mailing list archives

From Erick Erickson <>
Subject Re: tlog/commit questions
Date Wed, 03 Jul 2019 15:42:02 GMT
Let’s take this a piece at a time.

1. Commit failures are very rare. In fact, the only times I’ve seen them are when running out
of disk space, hitting OOMs, pulling the plug, etc. Look in your log files; is there any evidence
of the same?

2. OOM messages. To support Real Time Get, internal structures are kept for all docs that
have been indexed but that no searcher has yet been opened to make visible. With your settings,
you’re collecting up to 30 minutes of updates. This _may_ be relevant to your OOM problem, so
I’d recommend dropping your soft commit interval to maybe 5 minutes.
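
A minimal sketch of that change, assuming the rest of the <updateHandler> section stays as posted and that 5 minutes (300000 ms) turns out to be a reasonable value for your load:

```xml
<!-- solrconfig.xml, inside <updateHandler>. The 300000 value is only the
     suggested starting point mentioned above, not a tuned number. -->
<autoSoftCommit>
  <maxTime>300000</maxTime> <!-- 5 minutes, down from 30 minutes -->
</autoSoftCommit>
```

Each soft commit opens a new searcher, so the pending updates become visible every 5 minutes instead of every 30.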

3. Tlogs shouldn’t replay much. They only replay when Solr quits abnormally: OOM, kill -9,
pulling the plug, etc. When Solr is shut down gracefully, e.g. with “bin/solr stop”, it should
commit before closing and should _not_ replay anything from the tlog. Of course, you should
stop indexing while shutting down Solr…

4. There are lots of improvements in Solr 7.x. Go to the latest Solr version (7.7.2) rather
than 7.3.1. That said, TMP (TieredMergePolicy) has been around for a long, long time, and the
low-level process of merging segments hasn’t changed. One thing that _has_ changed is that TMP
will now respect the max segment size (5G) when optimizing or doing an expungeDeletes. I strongly
recommend that you do neither of those unless you can demonstrate a need; I’m just mentioning
it in case you already do.

All in all, I’d recommend getting to the bottom of your OOM issues. Absent abnormal termination,
tlog replay really shouldn’t be happening. Are you totally sure that it was TLOG replay
and not a full sync from the leader?


> On Jul 3, 2019, at 12:36 AM, Avi Steiner <> wrote:
> Hi
> We had some cases with customers (Solr 5.3.1, one search node, one shard) with huge tlog
> files (more than 1 GB).
> Our settings:
> <updateHandler class="solr.DirectUpdateHandler2">
>   <autoCommit>
>     <maxDocs>10000</maxDocs>
>     <maxTime>30000</maxTime> <!-- 30 seconds -->
>     <openSearcher>false</openSearcher> <!-- don't open a new searcher -->
>   </autoCommit>
>   <autoSoftCommit>
>     <maxTime>1800000</maxTime> <!-- 30 minutes -->
>   </autoSoftCommit>
>   <updateLog>
>     <str name="dir">${}</str>
>   </updateLog>
> </updateHandler>
> I don't have enough logs, so I don't know whether a commit failed or not. I just remember
> there were OOM messages.
> As you may know, during restart, Solr tries to replay from the tlog. This may take a lot of
> time. I tried moving the files to another location, started Solr, and only after the core
> was loaded moved the tlogs back to their original location. They were cleared after a while.
> So I have a few questions:
>  1.  Do you have any idea what could cause commit failures?
>  2.  Should we decrease the maxTime for hard commit or any other settings?
>  3.  Is there any way to replay tlog asynchronously (or disable it, so we will be able
>      to call it programmatically from our code in a separate thread), so Solr will be loaded more
>  4.  Is there any improvement in Solr 7.3.1?
> Thanks in advance
> Avi
