lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: solrcloud Auto-commit doesn't seem reliable
Date Fri, 09 Feb 2018 21:07:41 GMT
Do you by any chance have buffering turned on for CDCR? That parameter
is misleading. If true, tlogs will accumulate forever. The blanket
recommendation is becoming "turn buffering off and leave it off"; the
original intention there has really been replaced by bootstrapping.
Buffering was there for maintenance windows before bootstrapping was
put in place. However, even if this is an issue, it shouldn't be
affecting your commits.
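
To check or change the buffer state, the CDCR request handler can be queried over HTTP. Here is a minimal sketch that only builds the request URLs (it sends nothing). The host, port, and collection name are hypothetical placeholders, and it assumes the handler is registered at the usual /cdcr path in solrconfig.xml. STATUS and DISABLEBUFFER are CDCR API actions.

```python
from urllib.parse import urlencode


def cdcr_url(base_url: str, collection: str, action: str) -> str:
    """Build a CDCR API URL for one collection; no request is sent."""
    query = urlencode({"action": action, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/cdcr?{query}"


# Hypothetical host and collection name, for illustration only.
base = "http://localhost:8983/solr"
print(cdcr_url(base, "mycollection", "STATUS"))         # inspect current state
print(cdcr_url(base, "mycollection", "DISABLEBUFFER"))  # turn buffering off
```

Fetching the STATUS URL first (with curl or a browser) should show whether the buffer is currently enabled before you change anything.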

This is puzzling. CDCR is just using tlogs as a queueing mechanism.
Docs are sent from the source to the target cluster, but once received
they're _supposed_ to be just like any other doc that's indexed, i.e.
when the first one is received it should trigger the start of the
autocommit intervals.

Is there any possibility that:
1> your config isn't correct? I'm guessing this is just a typo:
"autoSoftcommit", but worth checking. Besides, that's not your main
concern anyway.
2> your startup parameters override your solrconfig settings for hard
commit interval and set it to -1 or something? When you start Solr you
should see the result of all the various ways you can set these
intervals "rolled up", look for messages like:

INFO  org.apache.solr.update.CommitTracker; Hard AutoCommit: if
uncommited for 15000ms;
INFO  org.apache.solr.update.CommitTracker; Soft AutoCommit: disabled
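
If the logs are large, a small script can pull those lines out. This is a rough sketch that assumes the message format shown above (the exact format may differ between Solr versions); the function name and the sample lines are illustrative only.

```python
import re

# Matches the CommitTracker startup lines quoted above, e.g.
#   "... CommitTracker; Hard AutoCommit: if uncommited for 15000ms;"
#   "... CommitTracker; Soft AutoCommit: disabled"
# ("uncommited" is spelled that way in the log output itself.)
PATTERN = re.compile(
    r"CommitTracker;?\s+(Hard|Soft) AutoCommit:\s+"
    r"(disabled|.*?for\s+(\d+)ms)"
)


def autocommit_settings(log_lines):
    """Map 'Hard'/'Soft' to the interval in ms, or None if disabled.

    A key is present only if a matching line was found.
    """
    found = {}
    for line in log_lines:
        m = PATTERN.search(line)
        if m:
            kind, ms = m.group(1), m.group(3)
            found[kind] = int(ms) if ms else None
    return found


sample = [
    "INFO  org.apache.solr.update.CommitTracker; Hard AutoCommit: if uncommited for 15000ms;",
    "INFO  org.apache.solr.update.CommitTracker; Soft AutoCommit: disabled",
]
print(autocommit_settings(sample))
```

A hard autocommit that shows up as disabled, or with an interval you did not expect, would point at a startup parameter overriding solrconfig.xml.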


There's some chance that the admin console is misleading, but you've
seen the behavior change when you commit, so that's unlikely to be the
root of your issue either. Just mentioning it in passing.

bq: We see relevancy scores get inconsistent when there are too many
deletes which seems to happen when hard commits don't happen

Right, the hard commit will trigger merging, which in turn removes the
terms associated with deleted documents in the segments that are
merged, which in turn changes your TF/IDF stats. So this is another
piece of evidence that you're getting unexpected behavior. And your
tlogs will accumulate forever too.
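
To make the effect on scoring concrete, here is a small worked sketch. It assumes the BM25 form of IDF (the default similarity in recent Solr versions; your setup may use a different one) and uses made-up document counts. The point is only that deleted-but-unmerged documents still count toward the term statistics.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF as used by BM25: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical numbers: 1,000 live docs, 100 of which contain the term.
live_docs, live_with_term = 1000, 100
# 500 deleted-but-unmerged docs, 300 of them containing the term.
deleted_docs, deleted_with_term = 500, 300

# After merging, only live documents contribute to the statistics.
idf_after_merge = bm25_idf(live_docs, live_with_term)
# Before merging, the deleted documents are still counted.
idf_before_merge = bm25_idf(live_docs + deleted_docs,
                            live_with_term + deleted_with_term)

print(f"before merge: {idf_before_merge:.3f}, after merge: {idf_after_merge:.3f}")
```

With these numbers the term looks more common before the merge than it really is, so its IDF is lower. Replicas that hold different amounts of deleted data would then score the same query differently.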

This is the first time I've ever heard of this problem, so I'm still
thinking that this is something odd about your setup, but what it is
escapes me from what you've said so far.

I want to check one other thing: You say you've seen this behavior in
4.10. CDCR wasn't introduced until considerably later, so what was the
scenario in the 4.10 case? Is my tangent about CDCR just a red herring?

Best,
Erick


On Fri, Feb 9, 2018 at 8:29 AM, Webster Homer <webster.homer@sial.com> wrote:
> A little more background. Our production Solrclouds are populated via CDCR.
> CDCR does not replicate commits; commits to the target clouds happen via
> autoCommit settings.
>
> We see relevancy scores get inconsistent when there are too many deletes,
> which seems to happen when hard commits don't happen.
>
> On Fri, Feb 9, 2018 at 10:25 AM, Webster Homer <webster.homer@sial.com>
> wrote:
>
>> We do have autoSoftcommit set to 3 seconds. It is NOT the visibility of
>> the records that is my primary concern. What I am concerned about is the
>> accumulation of uncommitted tlog files and the larger number of deleted
>> documents.
>>
>> I am VERY familiar with the Solr documentation on this.
>>
>> On Fri, Feb 9, 2018 at 10:08 AM, Shawn Heisey <apache@elyograg.org> wrote:
>>
>>> On 2/9/2018 8:44 AM, Webster Homer wrote:
>>>
>>>> I look at the latest timestamp on a record in the collection and see that
>>>> it is over 24 hours old.
>>>>
>>>> I send a commit to the collection, and then see that the core is now
>>>> current, and the segments are fewer. The commit worked
>>>>
>>>> This is the setting in solrconfig.xml
>>>> <autoCommit> <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
>>>> <openSearcher>false</openSearcher> </autoCommit>
>>>>
>>>
>>> As recommended, you have openSearcher set to false.
>>>
>>> This means that these commits are NEVER going to make changes visible.
>>>
>>> Don't go and change openSearcher to true.  It is STRONGLY recommended to
>>> have openSearcher=false in your autoCommit settings.  The reason for this
>>> configuration is that it prevents the transaction log from growing out of
>>> control.  With openSearcher=false, those commits will be very fast.  This
>>> is because it's opening the searcher that's slow, not the process of
>>> writing data to disk.
>>>
>>> Here's the recommended reading on the subject:
>>>
>>> https://lucidworks.com/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>>
>>> For change visibility, configure autoSoftCommit, probably with a
>>> different interval than you have for autoCommit.  I would recommend a
>>> longer interval.  Or include the commitWithin parameter on at least some of
>>> your update requests.  Or send explicit commit requests, preferably as soft
>>> commits.
>>>
>>> Thanks,
>>> Shawn
>>>
>>
>>
>
