From: "Geary, Frank"
To: "solr-user@lucene.apache.org"
Date: Tue, 4 Mar 2014 15:00:41 -0500
Subject: RE: Solr 4.5.0 replication numDocs larger in slave

Here's what I believe is my solution:

Yesterday I changed "nrtMode" to false in my solrconfig.xml (see the example solrconfig.xml for more info) on each master and slave server. As of today the numDocs are the same in each master/slave pair, but I'll continue watching this for a bit.

Anyway, I believe that the numDocs in the master was jumping ahead of the slave due to nrtMode being set to true (which is the default). Having nrtMode set to true causes the IndexReaders to be reopened from the IndexWriter after the commit, and thus, if my guess is right, the IndexWriter is effectively soft committing some adds and deletes on a normal basis, even though I did not explicitly turn on any soft committing that I know of. Then anytime an IndexReader is reopened based on the IndexWriter, you will see those soft commits. But setting nrtMode to false causes the IndexReaders to be reopened from the Directory, which will never see those soft commits, only the hard commits. And of course on the slave side, after replication, the slave never sees any soft commits, only hard commits from the Directory.

Frank

-----Original Message-----
From: Geary, Frank
Sent: Monday, March 03, 2014 12:10 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr 4.5.0 replication numDocs larger in slave

Thanks Greg.
We optimize the master once a week (early in the day Sunday) and we do not do a commit Sunday evening (the only evening of the week when we do not commit). So now after optimization/replication the master/slave pair that were out of sync on Friday now have the same numDocs (and every other value on the Overview page agrees except "size" under Replication, where it shows the slave is smaller). Unfortunately, a different master/slave pair now have different numDocs after the optimize and replication done yesterday.

For the newly out-of-sync master/slave pair, the Version (under Statistics on the Overview page) is 4 revisions earlier on the slave than on the master and there are two fewer segments on the slave than there are on the master. Under Replication on the Overview page, the Versions and Gens are all the same, but the size of the slave is smaller than the master. The slave has 51 fewer documents than the master. But indexing is continuing on the master (though no commit has happened since the optimization early Sunday).

I wonder if this is related to the NRT functionality in some way. I see "Impl: org.apache.solr.core.NRTCachingDirectoryFactory" on the Overview page. I've been trying to rely on default behavior whenever possible. But perhaps I need to turn something off?

Frank

-----Original Message-----
From: Greg Walters [mailto:greg.walters@answers.com]
Sent: Monday, March 03, 2014 10:00 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.5.0 replication numDocs larger in slave

I just ran into an issue similar to this that affected document scores on distributed searches. You might try doing an optimize and purging your deleted documents while no indexing is being done, then checking your counts. Once I optimized all my indexes the document counts on all of my cores matched up and scoring was consistent.
Thanks,
Greg

On Feb 28, 2014, at 8:22 PM, Erick Erickson wrote:

> That really shouldn't be happening IF indexing is shut off. Otherwise
> the slave is taking a snapshot of the master index and synching.
>
> bq: The slave has about 33 more documents and one fewer segment
> (according to Overview in solr admin
>
> Sounds like the master is still indexing and you've deleted documents
> on the master.
>
> Best,
> Erick
>
>
> On Fri, Feb 28, 2014 at 11:08 AM, Geary, Frank wrote:
>
>> Hi,
>>
>> I'm using Solr 4.5.0. I have a single master replicating to a single
>> slave. Only the master is being indexed to - never the slave. The
>> master is committed once each night. After the first commit and
>> replication the numDocs counts are identical. After the next nightly
>> commit and after the second replication a few minutes later, the
>> numDocs has increased in both the master and the slave as expected,
>> but numDocs is not the same in the master as it is in the slave. The
>> slave has about 33 more documents and one fewer segment (according to
>> Overview in solr admin).
>>
>> I suspect the numDocs may be in sync again after tonight, but can anyone
>> explain what is going on here? Is it possible a few deletions got
>> committed to the master but not replicated to the slave?
>>
>> Thanks
>>
>> Frank
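
For reference, a minimal sketch of the nrtMode change described in the most recent message of this thread. It assumes the layout of the Solr 4.x example solrconfig.xml, where nrtMode sits inside the indexConfig element. The thread does not show Frank's actual configuration, so the placement and the comments below are illustrative only:

```xml
<!-- solrconfig.xml (Solr 4.x), applied on each master and each slave -->
<indexConfig>
  <!-- Assumption: all other indexConfig settings are left at their defaults. -->
  <!-- false: reopen IndexReaders from the Directory rather than from the
       IndexWriter, so searchers only see hard-committed changes.
       The default in Solr 4.x is true. -->
  <nrtMode>false</nrtMode>
</indexConfig>
```

A core reload or restart would typically be needed before the setting takes effect. After the next commit and replication, numDocs can be compared on the Overview page of each server in the pair.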