lucene-solr-user mailing list archives

From "Xie, Sean" <Sean....@finra.org>
Subject Re: CDCR - how to deal with the transaction log files
Date Sat, 08 Jul 2017 17:14:01 GMT
I have monitored the CDCR process for a while, and updates are being sent to the target
without a problem. However, the tlog size and file count grow every day; even when there
are 0 updates left to send, the tlog files stay there:

The following is from the action=queues command. After about a month of running,
the transaction logs total about 140K files, and their size is about 103G.

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">465</int>
</lst>
<lst name="queues">
<lst name="some_zk_url_list">
<lst name="MY_COLLECTION">
<long name="queueSize">0</long>
<str name="lastTimestamp">2017-07-07T23:19:09.655Z</str>
</lst>
</lst>
</lst>
<long name="tlogTotalSize">102740042616</long>
<long name="tlogTotalCount">140809</long>
<str name="updateLogSynchronizer">stopped</str>
</response>
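
For readers who want to track these numbers over time, here is a minimal sketch that parses a response like the one above and pulls out the tlog totals. It assumes the XML has already been fetched and is available as a string; the function name summarize_queues and the dictionary keys are made up for illustration and are not part of Solr.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the action=queues response shown above.
QUEUES_RESPONSE = """<response>
<lst name="queues">
<lst name="some_zk_url_list">
<lst name="MY_COLLECTION">
<long name="queueSize">0</long>
<str name="lastTimestamp">2017-07-07T23:19:09.655Z</str>
</lst>
</lst>
</lst>
<long name="tlogTotalSize">102740042616</long>
<long name="tlogTotalCount">140809</long>
<str name="updateLogSynchronizer">stopped</str>
</response>"""


def summarize_queues(xml_text):
    """Extract queue sizes and tlog totals from a CDCR queues response."""
    root = ET.fromstring(xml_text)

    def top_level_text(tag, name):
        # Look only at direct children of <response>.
        node = root.find(f"./{tag}[@name='{name}']")
        return node.text if node is not None else None

    # queues -> one <lst> per target ZooKeeper address -> one <lst> per collection
    queue_sizes = {}
    for target in root.findall("./lst[@name='queues']/lst"):
        for collection in target.findall("./lst"):
            size = collection.find("./long[@name='queueSize']")
            key = (target.get("name"), collection.get("name"))
            queue_sizes[key] = int(size.text)

    return {
        "queue_sizes": queue_sizes,
        "tlog_total_bytes": int(top_level_text("long", "tlogTotalSize")),
        "tlog_total_count": int(top_level_text("long", "tlogTotalCount")),
        "update_log_synchronizer": top_level_text("str", "updateLogSynchronizer"),
    }


if __name__ == "__main__":
    summary = summarize_queues(QUEUES_RESPONSE)
    size_gb = summary["tlog_total_bytes"] / 1e9
    print(f"tlog files: {summary['tlog_total_count']}, size: {size_gb:.1f} GB")
    for (target, collection), size in summary["queue_sizes"].items():
        print(f"{collection} -> {target}: queueSize={size}")
```

A queueSize of 0 alongside a large tlogTotalCount is the pattern described in this message: nothing is pending for the target, yet the tlog files are not being cleaned up.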

Any help with this? Or do I need to configure something else? The CDCR configuration pretty
much follows the wiki:

On target:

  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="buffer">
      <str name="defaultState">disabled</str>
    </lst>
  </requestHandler>

  <updateRequestProcessorChain name="cdcr-processor-chain">
    <processor class="solr.CdcrUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

  <requestHandler name="/update" class="solr.UpdateRequestHandler">
    <lst name="defaults">
      <str name="update.chain">cdcr-processor-chain</str>
    </lst>
  </requestHandler>

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog class="solr.CdcrUpdateLog">
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog>
    <autoCommit> 
      <maxTime>${solr.autoCommit.maxTime:180000}</maxTime>
      <openSearcher>false</openSearcher> 
    </autoCommit>

    <autoSoftCommit> 
      <maxTime>${solr.autoSoftCommit.maxTime:30000}</maxTime>
    </autoSoftCommit>     
  </updateHandler>

On source:
  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="replica">
      <str name="zkHost">${TargetZk}</str>
      <str name="source">MY_COLLECTION</str>
      <str name="target">MY_COLLECTION</str>
    </lst>

    <lst name="replicator">
      <str name="threadPoolSize">1</str>
      <str name="schedule">1000</str>
      <str name="batchSize">128</str>
    </lst>

    <lst name="updateLogSynchronizer">
      <str name="schedule">60000</str>
    </lst>
  </requestHandler>

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog class="solr.CdcrUpdateLog">
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog>
    <autoCommit> 
      <maxTime>${solr.autoCommit.maxTime:180000}</maxTime>
      <openSearcher>false</openSearcher> 
    </autoCommit>

    <autoSoftCommit> 
      <maxTime>${solr.autoSoftCommit.maxTime:30000}</maxTime>
    </autoSoftCommit>     
  </updateHandler>

Thanks.
Sean

On 7/8/17, 12:10 PM, "Erick Erickson" <erickerickson@gmail.com> wrote:

    This should not be the case if you are actively sending updates to the
    target cluster. The tlog is used to store unsent updates, so if the
    connection is broken for some time, the target cluster will have a
    chance to catch up.
    
    If you don't have the remote DC online and do not intend to bring it
    online soon, you should turn CDCR off.
    
    Best,
    Erick
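
One reading of "turn CDCR off" is to send the STOP action to the /cdcr request handler on the source collection. The sketch below only builds the request URLs and does not send anything. The base URL http://localhost:8983/solr is an assumption, and cdcr_url is a hypothetical helper, not a Solr or SolrJ API.

```python
from urllib.parse import urlencode


def cdcr_url(base_url, collection, action):
    """Build a URL for the /cdcr request handler of one collection."""
    query = urlencode({"action": action})
    return f"{base_url.rstrip('/')}/{collection}/cdcr?{query}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # assumed host and port
    # STATUS reports the current process and buffer state; STOP halts replication.
    for action in ("STATUS", "STOP"):
        print(cdcr_url(base, "MY_COLLECTION", action))
```

Checking STATUS before and after STOP is a simple way to confirm the state actually changed.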
    
    On Fri, Jul 7, 2017 at 9:35 PM, Xie, Sean <Sean.Xie@finra.org> wrote:
    > Once CDCR is enabled, the update log stores an unlimited number of entries. This causes
    > the tlog folder to get bigger and bigger, and the number of open files to grow. How can
    > one reduce the number of open files and the number of tlog files? If this is not handled
    > properly, sooner or later the log file size and open file count will exceed the limits.
    >
    > Thanks
    > Sean
    
