Date: Mon, 26 Jan 2015 15:43:21 -0600
Subject: solr cloud replicas goes in recovery mode after update
From: Vijay Sekhri
To: solr-user@lucene.apache.org

Hi Erick,

The older message seems to have been deleted, so I am sending a new one:
http://osdir.com/ml/solr-user.lucene.apache.org/2015-01/msg00773.html

In the solr.xml file I had the zk timeout set to ${zkClientTimeout:450000}.

One thing that has made it a bit better is the zk tickTime and syncLimit settings. I set them to higher values, as below. This may not be advisable, though.

tickTime=30000
initLimit=30
syncLimit=20

Now we observe that replicas do not go into recovery as often as before. Across the whole cluster at any given time I would have a couple of replicas in recovery, whereas earlier it was multiple replicas from every shard. The FAQ on the wiki (https://wiki.apache.org/solr/SolrCloud) says "The maximum is 20 times the tickTime.", so I decided to increase the tick time. Is this the correct approach?

One question I have is whether the auto commit settings have anything to do with this. Do they induce extra work for the searchers, because of which this would happen? I have tried the following settings:

* * * 500000* * 900000* * * * * * 200000* * 30000* * false* * *

I have increased the heap size to 15 GB for each JVM instance. I monitored heap usage during full indexing and it never goes beyond 8 GB. I don't see any full GC happening at any point.

Our rate is a variable rate.
It is not a sustained rate of 6000/second; there are intervals where it reaches that much, comes down, grows again, and comes down again. If I took an average it would be only 600/second, but that is not the real rate at any given time.

The version of Solr Cloud is 4.10. All indexers are Java programs running on different hosts using the CloudSolrServer API.

As I mentioned, it is much better now than before, but not completely as expected. If possible, we would like to have none of the replicas go into recovery.

I captured some logs before and after recovery:

14:16:52,774 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: delete "_5r2_1.del"
14:16:52,774 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: 0 msec to checkpoint
14:16:52,774 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: now checkpoint "_4qe(4.10.0):C4312879/1nts ; isCommit = false]
14:16:52,774 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: delete "_5r4_1.del"
14:16:52,774 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [IFD][recoveryExecutor-7-thread-1]: 0 msec to checkpoint
14:16:52,775 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [DW][recoveryExecutor-7-thread-1]: recoveryExecutor-7-thread-1 finishFullFlush success=true
14:16:52,775 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: findMerges: 34 segments
14:16:52,777 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_554(4.10.0):C3995865/780418:delGen=23 size=3669.307 MB [skip: too large]
14:16:52,777 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_4qe(4.10.0):C4312879/1370113:delGen=57 size=3506.254 MB [skip: too large]
14:16:52,777 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5co(4.10.0):C871785/93995:delGen=11 size=853.668 MB
14:16:52,777 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5kb(4.10.0):C424868/49572:delGen=12 size=518.704 MB
14:16:52,777 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5hm(4.10.0):C457977/83353:delGen=12 size=470.422 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_56u(4.10.0):C286775/11906:delGen=15 size=312.952 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5f5(4.10.0):C116528/43621:delGen=2 size=95.529 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5m7(4.10.0):C122852/54010:delGen=12 size=84.949 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5p7(4.10.0):C37457/649:delGen=2 size=54.241 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5of(4.10.0):C38545/5677:delGen=1 size=50.672 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qm(4.10.0):C29770/4:delGen=2 size=34.052 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5o8(4.10.0):C27155/7531:delGen=1 size=31.008 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5r2(4.10.0):C32769/4:delGen=2 size=27.410 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5r3(4.10.0):C26057 size=23.893 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5r4(4.10.0):C23934/3:delGen=2 size=22.004 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5nx(4.10.0):C33236/20669:delGen=2 size=19.861 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5r5(4.10.0):C24222/1:delGen=1 size=19.546 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5q0(4.10.0):C13610/2625:delGen=7 size=12.480 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qv(4.10.0):C3973/1:delGen=1 size=3.402 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qo(4.10.0):C3456/2:delGen=2 size=3.147 MB
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qq(4.10.0):C1997 size=1.781 MB [floored]
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5q2(4.10.0):C779 size=1.554 MB [floored]
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qh(4.10.0):C1765/1:delGen=1 size=1.549 MB [floored]
14:16:52,778 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qc(4.10.0):C1828/1:delGen=1 size=1.401 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qr(4.10.0):C1468 size=1.390 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qp(4.10.0):C1729/1:delGen=1 size=1.351 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qa(4.10.0):C967 size=1.235 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qi(4.10.0):C1241 size=1.146 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qy(4.10.0):C1407 size=1.050 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5q1(4.10.0):C402/1:delGen=1 size=0.954 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qw(4.10.0):C802/1:delGen=1 size=0.821 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qx(4.10.0):C638 size=0.818 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qt(4.10.0):C30 size=0.072 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: seg=_5qu(4.10.0):C27 size=0.063 MB [floored]
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: allowedSegmentCount=31 vs count=34 (eligible count=32) tooBigCount=2
14:16:52,779 INFO [org.apache.solr.update.LoggingInfoStream] (recoveryExecutor-7-thread-1) [TMP][recoveryExecutor-7-thread-1]: maybe=_5co(4.10.0):C871785/93995:delGen=11 _5kb(4.10.0):C424868/49572:delGen=12 _5hm(4.10.0):C457977/83353:delGen=12 _56u(4.10.0):C286775/11906:delGen=15 _5f5(4.10.0):C116528/43621:delGen=2 _5m7(4.10.0):C122852/54010:delGen=12 _5p7(4.10.0):C37457/649:delGen=2 _5of(4.10.0):C38545/5677:delGen=1 _5qm(4.10.0):C29770/4:delGen=2 _5o8(4.10.0):C27155/7531:delGen=1 score=0.7313777596572674 skew=0.341 nonDelRatio=0.852 tooLarge=false size=2506.203 MB
.....
14:16:58,367 INFO [org.apache.solr.update.LoggingInfoStream] (searcherExecutor-6-thread-1) [IFD][searcherExecutor-6-thread-1]: delete "_5p7_1.del"
14:16:58,367 INFO [org.apache.solr.update.LoggingInfoStream] (searcherExecutor-6-thread-1) [IFD][searcherExecutor-6-thread-1]: delete "_5nx_1.del"
14:16:58,367 INFO [org.apache.solr.update.LoggingInfoStream] (searcherExecutor-6-thread-1) [IFD][searcherExecutor-6-thread-1]: delete "_5co_a.del"
14:16:58,367 INFO [org.apache.solr.update.LoggingInfoStream] (searcherExecutor-6-thread-1) [IFD][searcherExecutor-6-thread-1]: delete "_5q0_6.del"
14:16:58,367 INFO [org.apache.solr.update.LoggingInfoStream] (searcherExecutor-6-thread-1) [

-- 
*********************************************
Vijay Sekhri
*********************************************
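
On the tickTime question raised in the message: by default a ZooKeeper server negotiates each client's requested session timeout into the range 2 x tickTime to 20 x tickTime, which is what the quoted FAQ sentence refers to. The rough sketch below works through that arithmetic for the zkClientTimeout of 450000 ms mentioned in the message. The function name and the comparison tickTime of 2000 ms (the value in ZooKeeper's sample configuration) are assumptions for illustration and are not taken from the original setup.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms,
                               min_ms=None, max_ms=None):
    """Clamp a requested session timeout to the server's allowed range.

    Assumes ZooKeeper's defaults when no explicit bounds are given:
    minSessionTimeout = 2 * tickTime, maxSessionTimeout = 20 * tickTime.
    """
    lower = 2 * tick_time_ms if min_ms is None else min_ms
    upper = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lower, min(requested_ms, upper))


# zkClientTimeout from solr.xml in the message
requested = 450_000

# 2000 ms is an assumed baseline; 30000 ms is the value from the message
for tick in (2_000, 30_000):
    effective = negotiated_session_timeout(requested, tick)
    print(f"tickTime={tick} ms -> effective session timeout {effective} ms")
```

Under these assumptions the 450000 ms request is capped at 40000 ms with tickTime=2000 and honoured in full with tickTime=30000. ZooKeeper also has a maxSessionTimeout setting that raises the cap without changing tickTime, which may be worth comparing, since tickTime also scales initLimit and syncLimit.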