Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id AECBD200C32 for ; Thu, 9 Mar 2017 16:25:15 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id AD472160B67; Thu, 9 Mar 2017 15:25:15 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id A7F9C160B64 for ; Thu, 9 Mar 2017 16:25:14 +0100 (CET) Received: (qmail 68343 invoked by uid 500); 9 Mar 2017 15:25:13 -0000 Mailing-List: contact solr-user-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: solr-user@lucene.apache.org Delivered-To: mailing list solr-user@lucene.apache.org Received: (qmail 68331 invoked by uid 99); 9 Mar 2017 15:25:12 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 09 Mar 2017 15:25:12 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 3F6CBC1343 for ; Thu, 9 Mar 2017 15:25:12 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.304 X-Spam-Level: X-Spam-Status: No, score=0.304 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.096, RCVD_IN_SORBS_SPAM=0.5, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id tVDYZqmx-RF2 for ; Thu, 9 Mar 2017 15:25:10 +0000 (UTC) Received: from mail-ua0-f176.google.com (mail-ua0-f176.google.com [209.85.217.176]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 21A085F647 for ; Thu, 9 Mar 2017 15:25:10 +0000 (UTC) Received: by mail-ua0-f176.google.com with SMTP id f54so78222479uaa.1 for ; Thu, 09 Mar 2017 07:25:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-transfer-encoding; bh=0aF2Q4SU9rWlco/RoJ9mPGY8WX2pUTjA5d6mZ2+HwHQ=; b=nedWP/0abDfKwmu8k+EFTQ2ZXJz0weUqYIUy3pOA0jb78s5KtkzVo0JLM9M5pQbd5b +CaZQEVeEe22Gg02SyCvo5mklfPRfv0BCbhgciONNJ9fx2ar5F+hSQRx2yZYD0ErARdp iBnuyV7M7B+5/ug4vrX5znYnVb9N+2BHGGIOzGLfeb1cwJUD6mMupCB14zd3PFkuzqHS r6ikHWNKFGLnsvTcElGdEU3aRazJ4FneUXk5MpoZ7b//K/gNqVVkzXOduyV10HxCXtyP uSIUK1VdxvvZf93xiI6yIdw9hhKo2UyA83U4Boe/vjGu87165zrzCybC/okC7gc3K/JU IMxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:content-transfer-encoding; bh=0aF2Q4SU9rWlco/RoJ9mPGY8WX2pUTjA5d6mZ2+HwHQ=; b=Zzq1BftaKs3TP/14M2gGwJJ1MaZsh8+0ME+ycv6Ujc4DvrICrTg0w23jwreXYZOGJv Gee9u8kOA8+M6Y/q3xkt6/OFs+be0Fson2L3ACuHJetCyuU4Ke7GAYArZZw+BZY8FUdm GZ0efUt2dVCqtXLnhcLd5AUFGFwDvvz+22NRwe4WXfe7bm9rD7Ko+wokCsVkEKj2cuUa Nsl0FmOYCk+Hshw5mzEKAGYwmasyUioji/NhcBeO/DbZCB8d3xDdURf17xOud5W/2uHT g3BPorwJoq9Gsc/f4BQSpa0k1p8s7QZME0TrxJdS7xXgB37mfdBPXiyfqepsRZc9mxQQ bIoQ== X-Gm-Message-State: AMke39ljhMw3i2O4X10g2ueqAL7RWxVdMTyvg3g6OEhhAh//4LyJkacVXJLY4t9MEWsnSvaty0H5E5NxlomGmQ== X-Received: by 10.31.52.83 with SMTP id b80mr5941484vka.146.1489073109310; Thu, 09 Mar 2017 07:25:09 -0800 (PST) MIME-Version: 1.0 Received: by 10.159.53.12 with HTTP; Thu, 9 Mar 2017 07:24:28 -0800 (PST) In-Reply-To: References: From: Alexandre Rafalovitch Date: Thu, 9 Mar 2017 10:24:28 -0500 Message-ID: Subject: Re: DIH Full Index Issue To: solr-user Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable archived-at: Thu, 09 Mar 2017 15:25:15 -0000 I am not super familiar with what happens with DIH in the SolrCloud and I was talking about HTTP access.log (which admittedly needs custom configuration). However, I do see this in the logs: 2017-03-09 13:40:00.356 INFO (qtp2080166188-41928) [c:collectionXXX s:shard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.c.S.Request [collectionXXX_shard1_replica2] webapp=3D/solr path=3D/dataimport params=3D{optimize=3Dtrue&commit=3Dtrue&clean=3Dtrue&command=3Dfull-import} status=3D0 QTime=3D3 .... 2017-03-09 13:40:26.115 INFO (qtp2080166188-39334) [c:collectionXXX s:shard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.p.LogUpdateProcessorFactory [collectionXXX_shard1_replica2] webapp=3D/solr path=3D/update params=3D{update.distrib=3DTOLEADER&distrib.from=3Dhttp://XXX.XXX.XXX.189:8= 080/solr/collectionXXX_shard1_replica1/&wt=3Djavabin&version=3D2}{} 0 0 .... 2017-03-09 13:40:26.131 INFO (qtp2080166188-34463) [c:collectionXXX s:shard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=3Dfalse,openSearcher=3Dtrue,waitSearcher=3Dtrue,expungeDel= etes=3Dfalse,softCommit=3Dfalse,prepareCommit=3Dfalse} It seems like there is a DIH start at 13:40:00, but then at 13:40:26 there is an update hit to another shard which propogates to this one and causes the commit making DIH changes at that point visible. Interestingly, there seem to be an error in that distributed update as I see "missing required field: adtype", which may mean the actual incoming records from other shards did not get into the index. So, to me, my theory stands so far and I would look for something sending - invalid - documents to shard1_replica1 at the same time as you are running DIH on shard1_replica2. Regards, Alex. ---- http://www.solr-start.com/ - Resources for Solr users, new and experienced On 9 March 2017 at 10:13, AJ Lemke wrote: > Thanks for the tip Alex. > > I forgot to mention that we are using Solr in cloud mode. > > I have checked the logs and I am seeing the following lines (this is an a= bridged snippet): > > 2017-03-09 13:40:00.356 INFO (qtp2080166188-41928) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.DataImporter L= oading DIH Configuration: data-config.xml > 2017-03-09 13:40:00.356 INFO (qtp2080166188-41928) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.DataImporter D= ata Configuration loaded successfully > 2017-03-09 13:40:00.356 INFO (Thread-17768) [c:collectionXXX s:shard1 r:= core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.DataImporter Starting= Full Import > 2017-03-09 13:40:00.356 INFO (qtp2080166188-41928) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.c.S.Request [colle= ctionXXX_shard1_replica2] webapp=3D/solr path=3D/dataimport params=3D{opti= mize=3Dtrue&commit=3Dtrue&clean=3Dtrue&command=3Dfull-import} status=3D0 QT= ime=3D3 > 2017-03-09 13:40:00.450 INFO (Thread-17768) [c:collectionXXX s:shard1 r:= core_node1 x:collectionXXX_shard1_replica2] o.a.s.s.SolrIndexSearcher Openi= ng [Searcher@7c2e9318[collectionXXX_shard1_replica2] realtime] > 2017-03-09 13:40:00.560 INFO (Thread-17768) [c:collectionXXX s:shard1 r:= core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.JdbcDataSource Creati= ng a connection for entity item with URL: jdbc:sqlserver://sqlX:XXX;databas= eName=3DXXXXXX;responseBuffering=3Dadaptive; > 2017-03-09 13:40:00.576 INFO (Thread-17768) [c:collectionXXX s:shard1 r:= core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.JdbcDataSource Time t= aken for getConnection(): 13 > 2017-03-09 13:40:26.115 INFO (qtp2080166188-39334) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.p.LogUpdateProce= ssorFactory [collectionXXX_shard1_replica2] webapp=3D/solr path=3D/update = params=3D{update.distrib=3DTOLEADER&distrib.from=3Dhttp://XXX.XXX.XXX.189:8= 080/solr/collectionXXX_shard1_replica1/&wt=3Djavabin&version=3D2}{} 0 0 > 2017-03-09 13:40:26.115 ERROR (qtp2080166188-39334) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.RequestHandlerBa= se org.apache.solr.common.SolrException: [doc=3D11326865] missing required = field: adtype > 2017-03-09 13:40:26.131 INFO (qtp2080166188-34463) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.DirectUpdateHand= ler2 start commit{,optimize=3Dfalse,openSearcher=3Dtrue,waitSearcher=3Dtrue= ,expungeDeletes=3Dfalse,softCommit=3Dfalse,prepareCommit=3Dfalse} > 2017-03-09 13:40:26.131 INFO (qtp2080166188-34463) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.SolrIndexWriter = Calling setCommitData with IW:org.apache.solr.update.SolrIndexWriter@61db76= c2 > 2017-03-09 13:40:28.350 INFO (qtp2080166188-34463) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.s.SolrIndexSearche= r Opening [Searcher@a4ed679[collectionXXX_shard1_replica2] main] > 2017-03-09 13:40:28.350 INFO (searcherExecutor-50-thread-1-processing-n:= XXX.XXX.XXX.188:8080_solr x:collectionXXX_shard1_replica2 s:shard1 c:collec= tionXXX r:core_node1) [c:collectionXXX s:shard1 r:core_node1 x:collectionXX= X_shard1_replica2] o.a.s.c.QuerySenderListener QuerySenderListener sending = requests to Searcher@a4ed679[collectionXXX_shard1_replica2] main{ExitableDi= rectoryReader(UninvertingDirectoryReader(Uninverting(_oa1(6.3.0):C31847) Un= inverting(_oa2(6.3.0):C2520)))} > 2017-03-09 13:40:28.350 INFO (searcherExecutor-50-thread-1-processing-n:= XXX.XXX.XXX.188:8080_solr x:collectionXXX_shard1_replica2 s:shard1 c:collec= tionXXX r:core_node1) [c:collectionXXX s:shard1 r:core_node1 x:collectionXX= X_shard1_replica2] o.a.s.c.QuerySenderListener QuerySenderListener done. > 2017-03-09 13:40:28.350 INFO (searcherExecutor-50-thread-1-processing-n:= XXX.XXX.XXX.188:8080_solr x:collectionXXX_shard1_replica2 s:shard1 c:collec= tionXXX r:core_node1) [c:collectionXXX s:shard1 r:core_node1 x:collectionXX= X_shard1_replica2] o.a.s.c.SolrCore [collectionXXX_shard1_replica2] Registe= red new searcher Searcher@a4ed679[collectionXXX_shard1_replica2] main{Exita= bleDirectoryReader(UninvertingDirectoryReader(Uninverting(_oa1(6.3.0):C3184= 7) Uninverting(_oa2(6.3.0):C2520)))} > 2017-03-09 13:40:28.365 INFO (qtp2080166188-34463) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.DirectUpdateHand= ler2 end_commit_flush > 2017-03-09 13:40:28.444 INFO (qtp2080166188-34463) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.p.LogUpdateProce= ssorFactory [collectionXXX_shard1_replica2] webapp=3D/solr path=3D/update = params=3D{update.distrib=3DFROMLEADER&waitSearcher=3Dtrue&openSearcher=3Dtr= ue&commit=3Dtrue&softCommit=3Dfalse&distrib.from=3Dhttp://XXX.XXX.XXX.189:8= 080/solr/collectionXXX_shard1_replica1/&commit_end_point=3Dtrue&wt=3Djavabi= n&version=3D2&expungeDeletes=3Dfalse}{commit=3D} 0 2309 > 2017-03-09 13:40:56.459 INFO (qtp2080166188-40636) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.p.LogUpdateProce= ssorFactory [collectionXXX_shard1_replica2] webapp=3D/solr path=3D/update/= json params=3D{commit=3Dtrue&wt=3Djson}{} 0 0 > 2017-03-09 13:40:56.459 ERROR (qtp2080166188-40636) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.RequestHandlerBa= se org.apache.solr.common.SolrException: [doc=3D553094612] missing required= field: adtype > 2017-03-09 13:41:00.053 INFO (qtp2080166188-41928) [c:collectionXXX s:sh= ard1 r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.c.S.Request [colle= ctionXXX_shard1_replica2] webapp=3D/solr path=3D/schema params=3D{wt=3Djso= n} status=3D0 QTime=3D0 > > > > AJ > > -----Original Message----- > From: Alexandre Rafalovitch [mailto:arafalov@gmail.com] > Sent: Wednesday, March 8, 2017 9:33 AM > To: solr-user > Subject: Re: DIH Full Index Issue > > Are you perhaps indexing at the same time from the source other than DIH?= Because the commit is global and all the changes from all the sources will= become visible. > > Check the access logs perhaps to see the requests to /update handler or s= imilar. > > Regards, > Alex. > > > ---- > http://www.solr-start.com/ - Resources for Solr users, new and experience= d > > > On 8 March 2017 at 09:27, AJ Lemke wrote: >> Good Morning List! >> >> I have an issue where my DIH full index is committed after a minute of i= ndexing. >> My counts will fall from around 400K to 85K until the import is finished= , usually about four (4) minutes later. >> >> This is problematic for us as there are 315K missing items in our search= es. >> >> Versioning Info: >> solr-spec - 6.3.0 >> solr-impl - 6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - >> 2016-11-02 19:52:42 lucene-spec - 6.3.0 lucene-impl - 6.3.0 >> a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 >> 19:47:11 >> >> solrconfig.xml snippet >> > class=3D"solr.DirectUpdateHandler2"> >> >> ${solr.ulog.dir:} >> >> >> -1 >> false >> >> >> -1 >> >> >> >> >> Any insights would be greatly appreciated. >> Let me know if more information is required. >> >> AJ