From: Alexandre Rafalovitch
Date: Mon, 5 Sep 2016 09:42:51 +0700
Subject: Re: Solr document missing or not getting indexed though we get 200 ok status from server
To: solr-user

I can't tell anything from the document provided. So, here would be my
thoughts:

If what you see is some sort of concurrency issue, the documents
missed/dropped are unlikely to be exactly the same ones each time. So, if
you see the same documents dropped, it is much more likely to be something
to do with the documents, with handler end-points, with sharding, etc.

If this is easily reproducible, I would run a network analyzer such as
Wireshark and compare your Admin UI session with your client session and
verify that everything expected is absolutely identical.

You could also temporarily turn on Debug via the Admin console (under
logs). You could turn individual elements to Trace to get low-level
information on what's happening.

Finally, I am assuming this is all happening with the latest Solr? If not,
it may be worth trying that and/or checking Jira for bugs. Lots of things
have been fixed/improved in more recent Solr related to multi-threaded,
multi-server setups.

Regards,
   Alex.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/

On 5 September 2016 at 00:17, Ganesh M wrote:
> Hi Alex,
> We tried to post the same document manually from the SOLR ADMIN / Documents
> UI. It got indexed successfully. We are sure that it's not a duplicate
> issue. We are using the default update handler and have not configured a
> custom one. We fire the request to index as a direct HTTP request using
> XML format. We are getting a 200 OK response, but the document is not
> getting indexed.
>
> This is the request we fired and got 200 for, but it is not getting
> indexed. The same request fired via the SOLR ADMIN / Documents UI gets
> indexed successfully.
> > > false > 55788327 > false > Factuur _PERF29161663_Voor _Va Bene.pdf > 55788327-PERF29161663 > 3.00 > 2916847 > STCUA0000021500000011472808279078 > EUR > 50.00 > VAT > 50.00 > UA000002150000001:VB1 > VB1:A000002150:vbgroupnft+1:1472808278137 > RA000002150AT009428 > 100000,false > 62440101 > UNKNOWN > RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f > RA000002150AT009425#pdf.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#f > 1472808279002 > CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001 > PERF2020916145437 LEA0000021509223370564294752110EXCC2000001 Va Bene VA > Beheer B.V. LEA0000021509223370564294689844EXCC1000001 VA Beheer B.V. VA > Beheer B.V.null null null 2.1null urn:www.cenbii.eu: > transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu: > bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.xnull > urn:www.cenbii.eu:profile:bii04:ver2.0null PERF20209161454372 null > 1472754600000null 3806 UNCL1001 null EUR6 ISO 4217 Alpha null null > 29168472 null null pdf.pdf2 null null RA000002150AT009425#pdf.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#fpdf.pdf > application/pdf null null Factuur _PERF29161663_Voor _Va Bene.pdf2 null > PrimaryImagenull null RA000002150AT009424#Factuur _PERF29161663_Voor _Va > Bene.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#fFactuur > _PERF29161663_Voor _Va Bene.pdf application/pdf null null null 62440101ZZZ > NL:KVK null null 2916847ZZZ NL:VAT null null VA Beheer B.V.null null > Schurinkstraatnull 23null Ommennull 7731GCnull null NL6 > ISO3166-1:Alpha2 null null 2916847ZZZ NL:VAT null null VAT6 UN/ECE 5153 > null null 62440101ZZZ NL:KVK null null null 55788327ZZZ NL:KVK 
null null > 55788327ZZZ NL:KVK null null Va Benenull null Voorstraatnull 26null > Voorschotennull 2251BNnull null NL6 ISO3166-1:Alpha2 null null > 2916847ZZZ NL:VAT null null VAT6 UN/ECE 5153 null null 55788327ZZZ > NL:KVK null null 1475173800000null null null null NL6 ISO3166-1:Alpha2 > null null 316 UNCL4461 null 1475087400000null 55788327-PERF29161663null > null 29168472 IBAN null UNKNOWNBIC null Betaling?binnen?14?dagen op > bankrekening?2916847?onder vermelding van?55788327/PERF29161663null null > 3.00EUR null null 50.00EUR null 3.00EUR null null S6 UNCL5305 null > 6.00null null VAT6 UN/ECE 5153 null null 50.00EUR null 50.00EUR null > 53.00EUR null 53.00EUR null null 102 null 5.00BX null 50.00EUR null > null PERF2020916145437null PERF2020916145437null null 12 null null S6 > UNCL5305 null 6.00null null VAT6 UN/ECE 5153 null null 10.00EUR null > RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f > DM001 XCNIN199751 NL:KVK:62440101 false false false false 10 > UA000002150000001:VB1 VB1:A000002150:vbgroupnft+1:1472808278137 Ontvangen > 1472808279002 Factuur GLDT9223370666504283001RA000000006DTP2000001 VB1 VB1 > UA000002150000001 RA000002150AT009428 vbgroupnft+1 A000002150 Group > 55788327 Va Bene XCNL034435 Va Bene > LEA0000021509223370564294752110EXCC2000001 vbgroupnft+1 A000002150 > PERF2020916145437 Group 62440101 VA Beheer B.V. XCNL034436 VA Beheer B.V. 
> LEA0000021509223370564294689844EXCC1000001 > STCUA0000021500000011472808279078 VB1 VB1 VB1 VB1 UA000002150000001 true > Factuur GLDT9223370666504283001RA000000006DTP2000001 EM0001 > NL:KVK:55788327 > vbgroupnft+1 > 10 > Va Bene > true > 50.00 > NL > 100000,false > 62440101 > XCNL034436 > 1475087400000 > Factuur > 2916847 > VB1 VB1 > 31 > NL > 53.00 > Ontvangen > 26 > CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001 > XCNL034435 > VAT > VA Beheer B.V. > VA Beheer B.V. > 2251BN > Va Bene > false > VB1 VB1 > 55788327 > 2.1 > PERF2020916145437 > 1475173800000 > EM0001 > PERF2020916145437 > PERF2020916145437 > urn:www.cenbii.eu:profile:bii04:ver2.0 > Betaling?binnen?14?dagen op bankrekening?2916847?onder > vermelding van?55788327/PERF29161663 > 1472754600000 > urn:www.cenbii.eu: > transaction:biicoretrdm010:ver1.0:#urn:www.peppol.eu: > bis:peppol4a:ver1.0#urn:www.simplerinvoicing.org: > si:si-ubl:ver1.1.x > 5.00 > UA000002150000001 > 55788327 > Group > NL:KVK:55788327 > LEA0000021509223370564294752110EXCC2000001 > LEA0000021509223370564294689844EXCC1000001 > CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001 > NL:KVK:62440101 > VA Beheer B.V. > 1472808279002 > 10.00 > Factuur > A000002150 > 62440101 > UA000002150000001 > A000002150 > GLDT9223370666504283001RA000000006DTP2000001 > DM001 > 55788327 > VAT > 1 > VAT > 2916847 > XCNIN199751 > VB1 VB1 > PERF2020916145437 > RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f > > Ommen > 6.00 > VA Beheer B.V. 
> 53.00 > Group > GLDT9223370666504283001RA000000006DTP2000001 > S > Va Bene > 2916847 > 23 > 100000,false > PrimaryImage > NL > 7731GC > CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001 > false > Voorschoten > RA000002150AT009425#pdf.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278843.png#f > > RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f > > CLEA0000021509223370564294689844EXCC10000019223370564046496793C1LEA0000021509223370564294752110EXCC2000001 > false > 2916847 > 1472808279002 > Schurinkstraat > LEA0000021509223370564294689844EXCC1000001 > Va Bene > 3.00 > 10 > 100000,false > S > PERF2020916145437 > vbgroupnft+1 > false > 380 > 50.00 > Voorstraat > RA000002150AT009424#Factuur _PERF29161663_Voor _Va Bene.pdf# > http://srv-cbe-col1.everbinding.com/thumbs/2016/9/2/A000002150/UA000002150000001/1472808278632.png#f > > 6.00 > LEA0000021509223370564294752110EXCC2000001
>
> The only difference is that when we post manually via SOLR ADMIN, the
> request is fired when there is no concurrency. But initially there would
> be around 50 threads firing update POST requests, and also a few threads
> firing GET requests to different collections.
>
> A little more information about the setup:
> We have around 5 collections and each collection has 2 shards (one shard
> in each node, one shard for index and the other for replica), in total 2
> nodes with a master-master setup.
>
> We are getting this error only when there is a concurrency of around 50
> threads firing POST requests to various collections at the same time.
>
> The strange thing is that SOLR does not return an error when it's not able
> to index the document. If SOLR had returned an error, we could have retried
> the document indexing. Is there any way we can make SOLR return an error
> instead of 200 when it fails to index?
>
> Regards,
> Ganesh
>
> On Sun, Sep 4, 2016 at 10:11 PM Alexandre Rafalovitch wrote:
>
>> Can you identify the specific documents that 'fail'? What happens if
>> you post them manually? Try posting them manually but with one field
>> super-distinct to see whether it made it in. What happens if you post
>> it to an empty index (copy definition and try)?
>>
>> Also, what do your request handler's parameters look like? Perhaps you
>> have a signature processor, in which case it may be triggering
>> duplicate avoidance with a different calculation from just an id.
>>
>> My guess is still that it is some sort of duplicate issue.
>>
>> Regards,
>>    Alex.
>> ----
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>>
>> On 4 September 2016 at 23:10, Ganesh M wrote:
>> > Some more information on this... Most of the documents get indexed
>> > properly. A few documents are not getting indexed.
>> >
>> > All document POSTs are seen in localhost_access, and a 200 OK response
>> > is seen in the localhost access file. But in catalina, there are some
>> > differences in the logs. For the documents which are indexed properly,
>> > the following are the logs.
>> >
>> > FINE: PRE_UPDATE add {,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001} params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.TransactionLog
>> > FINE: New TransactionLog file=/ebdata2/solrdata/IOB_shard1_replica1/data/tlog/tlog.0000000000000220856, exists=false, size=0, openExisting=false
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.SolrCmdDistributor submit
>> > FINE: sending update to http://xx.xx.xx.xx:7070/solr/IOB_shard1_replica2/ retry:0 add{version=1544254202941800448,id=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001} params:update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fxx.xx.xx.xx%3A7070%2Fsolr%2FIOB_shard1_replica1%2F
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
>> > FINE: starting runner: org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
>> > FINE: PRE_UPDATE FINISH params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner run
>> > FINE: finished: org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner@3fb794b2
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
>> > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params= {crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001} {add=[CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001 (1544254202941800448)]}
>> > Sep 01, 2016 7:39:31 AM org.apache.solr.servlet.SolrDispatchFilter doFilter
>> > FINE: Closing out SolrRequest: params(crid=CUA0000004390000019223370564139207241C3LEA0000020769223370567404392838EXCC3000001),defaults(wt=xml)
>> > -------------------------------------------------
>> >
>> > For the document which is not getting indexed, we could see only the
>> > following log in catalina.out. Not sure whether it's getting added to
>> > SOLR.
>> >
>> >
>> > Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
>> > FINE: PRE_UPDATE FINISH params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
>> > Sep 01, 2016 7:39:56 AM org.apache.solr.update.processor.LogUpdateProcessor finish
>> > INFO: [IOB_shard1_replica1] webapp=/solr path=/update params= {crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002} {} 0 1
>> > Sep 01, 2016 7:39:56 AM org.apache.solr.servlet.SolrDispatchFilter doFilter
>> > FINE: Closing out SolrRequest: params(crid=CUA0000004390000019223370564139182810C3LEA0000020179223370567061972057EXCC1000002),defaults(wt=xml)
>> >
>> > ----------------------
>> >
>> > You can see that in the above log for the missing document (which is
>> > not indexed), we are not seeing "PRE_UPDATE add" in the catalina log. Is
>> > that the reason for the document not getting indexed?
>> >
>> > We have set autosoftcommit to 1 second and autohardcommit to 30 seconds.
>> >
>> > We are not getting any errors or exceptions in the log.
>> >
>> > This issue is becoming very critical and is a reliability concern.
>> > Though we get a 200 OK response from SOLR for the update HTTP POST
>> > request, nothing happens on the SOLR side. If SOLR is not able to
>> > process the request, shouldn't we get an error from SOLR instead of a
>> > 200 OK response?
>> >
>> > If anybody has faced this sort of issue, any help would be very much
>> > appreciated.
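The two log excerpts above differ in the LogUpdateProcessor summary line: the indexed document ends with an `{add=[...]}` block, the missing one with an empty `{}`. A minimal Python sketch for finding such requests in catalina.out follows. It assumes each log entry has already been unwrapped onto a single line; the function name and the sample ids (DOC_A, DOC_B) are hypothetical, and an empty block only suggests that Solr saw no add command in that request, it does not say why.

```python
import re

# Matches the LogUpdateProcessor summary line, for example:
#   INFO: [core] webapp=/solr path=/update params= {crid=DOC_B} {} 0 1
# The optional whitespace after "params=" tolerates the spacing seen in the
# quoted logs.
UPDATE_LINE = re.compile(
    r"path=/update params=\s*\{(?P<params>[^}]*)\}\s*\{(?P<result>[^}]*)\}"
)


def empty_updates(lines):
    """Return the params of every /update request whose result block is empty."""
    hits = []
    for line in lines:
        match = UPDATE_LINE.search(line)
        if match and not match.group("result").strip():
            hits.append(match.group("params"))
    return hits


# Hypothetical sample modelled on the two excerpts in this thread.
SAMPLE = [
    "INFO: [IOB_shard1_replica1] webapp=/solr path=/update params= "
    "{crid=DOC_A} {add=[DOC_A (1544254202941800448)]}",
    "INFO: [IOB_shard1_replica1] webapp=/solr path=/update params= "
    "{crid=DOC_B} {} 0 1",
]
```

Running `empty_updates` over the real log would give the crid values to compare against the client-side requests, for example in a Wireshark capture.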
>> >
>> >
>> > On Sun, Sep 4, 2016 at 12:59 PM Ganesh M <mganeshs@live.in> wrote:
>> > Nitin, thanks for the reply. Each of our documents has a unique id,
>> > which is its HBase rowkey id, so it will always be unique. There is no
>> > chance of duplicate ids being sent.
>> >
>> >
>> > On Sun 4 Sep, 2016 12:41 pm Nitin Kumar wrote:
>> > Please check the docs' unique key (Id). All keys should be unique.
>> > Otherwise docs having the same id will be replaced.
>> >
>> > On 04-Sep-2016 12:13 PM, "Ganesh M" <mganeshs@live.in> wrote:
>> >
>> >> Hi,
>> >> We keep sending documents to Solr from our app server: a single
>> >> document per request, but in parallel about 10 requests hit Solr Cloud
>> >> per second.
>> >>
>> >> We can see our POST requests (update requests) hitting our Solr 5.4 in
>> >> the localhost_access logs, with a 200 OK response. We also get an HTTP
>> >> 200 OK response on our app servers for the HTTP requests we fired to
>> >> Solr Cloud.
>> >>
>> >> But a few documents are not getting indexed. Out of 2000 documents we
>> >> sent, 10 documents are getting missed. Though there is no error, a few
>> >> documents are getting missed.
>> >>
>> >> We use autoSoftcommit as 2 secs and autohardcommit as 30 secs.
>> >>
>> >> Why are those 10 documents not getting indexed, and why is no error
>> >> thrown back if the server is not able to index them?
>> >>
>> >> Regards,
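
On the question raised in this thread, whether the client can detect a 200 OK that did not index anything: one possible workaround is to follow each update with a real-time get on the document id and retry when the document is absent. The Python sketch below is only an illustration under assumptions. The base URL, the collection name and all function names are hypothetical, it assumes the collection exposes Solr's `/get` real-time get handler and the XML `/update` handler, and it does not explain why the update was dropped in the first place.

```python
import json
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(doc):
    """Serialize a flat dict into Solr's XML <add> update message."""
    fields = "".join(
        f"<field name={quoteattr(name)}>{escape(str(value))}</field>"
        for name, value in doc.items()
    )
    return f"<add><doc>{fields}</doc></add>"


def rtg_url(base_url, collection, doc_id):
    """Build the real-time get URL for one document id."""
    query = urllib.parse.urlencode({"id": doc_id, "wt": "json"})
    return f"{base_url}/{collection}/get?{query}"


def doc_present(rtg_body):
    """True if a real-time get JSON response carries a document."""
    return json.loads(rtg_body).get("doc") is not None


def index_and_verify(base_url, collection, doc, id_field="id"):
    """POST one document, then confirm via real-time get that Solr has it.

    urlopen raises HTTPError for non-2xx responses, so reaching the second
    request means the update itself was answered with a success status.
    """
    request = urllib.request.Request(
        f"{base_url}/{collection}/update",
        data=build_add_xml(doc).encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        response.read()
    with urllib.request.urlopen(rtg_url(base_url, collection, doc[id_field])) as response:
        return doc_present(response.read())
```

A caller could retry or log the full request whenever `index_and_verify(...)` returns False, which would also give a concrete failing request to compare against the Admin UI session.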