lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rallavagu <rallav...@gmail.com>
Subject Re: tlog replay
Date Fri, 09 Oct 2015 00:38:01 GMT
Erick,

Actually, configured autocommit to 15 seconds and openSearcher is set to 
false. Neither 2 nor 3 happened. However, softCommit is set to 10 min.

<autoCommit>
        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
        <openSearcher>false</openSearcher>
      </autoCommit>

Working on upgrading to 5.3 which will take a bit of time and trying to 
get this under control until that time.

On 10/8/15 5:28 PM, Erick Erickson wrote:
> right, so the scenario is
> 1> somehow you didn't do a hard commit (openSearcher=true or false
> doesn't matter) for a really long time while indexing.
> 2> Solr abnormally terminated.
> 3> When Solr started back up it replayed the entire log.
>
> How <1> happened is the mystery though. With a hard commit
> (autocommit) interval of 15 seconds that's weird.
>
> The message indicates something like that happened. In very recent
> Solr versions, the log will have
> progress messages printed that'll help see this is happening.
>
> Best,
> Erick
>
> On Thu, Oct 8, 2015 at 12:23 PM, Rallavagu <rallavagu@gmail.com> wrote:
>> As a follow up.
>>
>> Eventually the tlog file is disappeared (could not track the time it took to
>> clear out completely). However, following messages were noticed in
>> follower's log.
>>
>> 5120638 [recoveryExecutor-14-thread-2] WARN org.apache.solr.update.UpdateLog
>> – Starting log replay tlog
>>
>> On 10/7/15 8:29 PM, Erick Erickson wrote:
>>>
>>> The only way I can account for such a large file off the top of my
>>> head is if, for some reason,
>>> the Solr on the node somehow was failing to index documents and kept
>>> adding them to the
>>> log for a loooonnngggg time. But how that would happen without the
>>> node being in recovery
>>> mode I'm not sure. I mean the Solr instance would have to be healthy
>>> otherwise but just not
>>> able to index docs which makes no sense.
>>>
>>> The usual question here is whether there were any messages in the solr
>>> log file indicating
>>> problems while this built up.
>>>
>>> tlogs will build up to very large sizes if there are very long hard
>>> commit intervals, but I don't
>>> see how that interval would be different on the leader and follower.
>>>
>>> So color me puzzled.
>>>
>>> Best,
>>> Erick
>>>
>>> On Wed, Oct 7, 2015 at 8:09 PM, Rallavagu <rallavagu@gmail.com> wrote:
>>>>
>>>> Thanks Erick.
>>>>
>>>> Eventually, followers caught up but the 14G tlog file still persists and
>>>> they are healthy. Is there anything to look for? Will monitor and see how
>>>> long will it take before it disappears.
>>>>
>>>> Evaluating move to Solr 5.3.
>>>>
>>>> On 10/7/15 7:51 PM, Erick Erickson wrote:
>>>>>
>>>>>
>>>>> Uhm, that's very weird. Updates are not applied from the tlog. Rather
>>>>> the
>>>>> raw doc is forwarded to the replica which both indexes the doc and
>>>>> writes it to the local tlog. So having a 14G tlog on a follower but a
>>>>> small
>>>>> tlog on the leader is definitely strange, especially if it persists over
>>>>> time.
>>>>>
>>>>> I assume the follower is healthy? And does this very large tlog
>>>>> disappear
>>>>> after a while? I'd expect it to be aged out after a few commits of >
100
>>>>> docs.
>>>>>
>>>>> All that said, there have been a LOT of improvements since 4.6, so it
>>>>> might
>>>>> be something that's been addressed in the intervening time.
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Oct 7, 2015 at 7:39 PM, Rallavagu <rallavagu@gmail.com>
wrote:
>>>>>>
>>>>>>
>>>>>> Solr 4.6.1, single shard, 4 node cloud, 3 node zk
>>>>>>
>>>>>> Like to understand the behavior better when large number of updates
>>>>>> happen
>>>>>> on leader and it generates huge tlog (14G sometimes in my case) on
>>>>>> other
>>>>>> nodes. At the same time leader's tlog is few KB. So, what is the
rate
>>>>>> at
>>>>>> which the changes from transaction log are applied at nodes? The
>>>>>> autocommit
>>>>>> interval is set to 15 seconds after going through
>>>>>>
>>>>>>
>>>>>> https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>>>>>
>>>>>> Thanks

Mime
View raw message