lucene-solr-user mailing list archives

From Amit Makwana <AMakw...@capitalnovus.com>
Subject Fw: Indexing issue in Solrcloud
Date Fri, 04 Aug 2017 10:10:31 GMT


Hello,

We are facing an indexing issue in SolrCloud.

Solr version: 6.3.0

Solr Cloud Configuration:

  * 3-node ZooKeeper ensemble with 3 Solr instances.
  * 1 collection with 3 shards and 2 replicas.


  * Indexing 10000 documents.
    Indexed docs per shard: Shard1=3393, Shard2=3351, Shard3=3256

Testing:

  * Bring down any leader replica while indexing is running.

  * A "Connection refused" exception is then generated continuously in the
    sendUpdateStream method of class ConcurrentUpdateSolrClient.

  * The data import reports 10000 records successfully indexed, but when I
    run a match-all (*:*) query, Solr returns only 6607 documents. The other
    3393 documents are not returned in the result.

Example:

  * We issued the dataimport command on devm50:8091. The devm50:8091 Solr
    instance is responsible for document routing: it indexes some of the
    documents in its local index and sends the others to the shards they
    belong to.

  * Suppose we bring down the devm50:8092 Solr instance, which hosts the
    leader replica of shard1. devm50:8093 is then selected as the leader
    replica.

  * The data import screen displays 10000 documents indexed/processed, but a
    search finds only 6607 documents. A search in shard1 returns 0 documents:
    none of the 3393 documents of shard1 are indexed, and a Connection
    refused exception is generated.
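
The counts above can be cross-checked with a few lines of Java. This is only
an illustrative sketch: the class and method names are made up for this note
and are not part of Solr, and the numbers are the ones quoted in this message.
It shows that the per-shard counts add up to 10000 and that the gap between
reported and searchable documents is exactly the shard1 count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: reconciles the counts quoted above.
class ShardCountCheck {

    /** Per-shard counts as quoted in this message. */
    static Map<String, Long> quotedCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("Shard1", 3393L);
        counts.put("Shard2", 3351L);
        counts.put("Shard3", 3256L);
        return counts;
    }

    /** Sum of the per-shard document counts. */
    static long total(Map<String, Long> perShard) {
        long sum = 0;
        for (long n : perShard.values()) {
            sum += n;
        }
        return sum;
    }

    /** Name of the shard whose count equals (reported - found), or null. */
    static String shardMatchingGap(Map<String, Long> perShard, long reported, long found) {
        long gap = reported - found;
        for (Map.Entry<String, Long> e : perShard.entrySet()) {
            if (e.getValue() == gap) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = quotedCounts();
        System.out.println("total = " + total(counts));              // total = 10000
        System.out.println("missing shard = "
                + shardMatchingGap(counts, 10000L, 6607L));          // missing shard = Shard1
    }
}
```

In other words, the loss is not spread over the collection: it matches the
shard whose leader was brought down.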



[Screenshots not preserved: Dataimport screen; *:* query; *:* query in Shard1]

From debugging the code, we found the following:


  * In the ConcurrentUpdateSolrClient class, documents are sent to the
    dedicated shard and its replica as SolrJ requests. The requests are
    added to the Runner queue and are run later by the scheduler.

  * When a leader goes down, the runner queue may still hold many requests
    that point to the old leader. The client tries to send those requests to
    the old leader and gets a Connection refused exception. Pending requests
    are not modified when a new leader is selected for that shard, so the
    runner queue keeps holding client requests that point to the old leader,
    not the new one.

  * There is no code to handle this case: if an exception is generated in
    ConcurrentUpdateSolrClient's sendInputStream(), it is not passed back to
    its caller.
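
The behaviour described above can be reproduced in miniature without Solr.
The following is a simplified simulation, not Solr code: QueuedUpdate,
drain and simulate are hypothetical names, and "sending" is reduced to
checking whether the target node is in a set of live nodes. It only
illustrates how a queue whose entries are bound to a leader at enqueue time
can lose documents while the caller's counter still reports success.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Simplified simulation (assumption: not the real Runner queue).
class StaleLeaderQueueDemo {

    /** A queued update; the target node is fixed when the request is enqueued. */
    static final class QueuedUpdate {
        final String docId;
        final String targetNode;

        QueuedUpdate(String docId, String targetNode) {
            this.docId = docId;
            this.targetNode = targetNode;
        }
    }

    /** Sends up to maxRequests queued updates; returns how many were delivered. */
    static int drain(Queue<QueuedUpdate> queue, Set<String> liveNodes, int maxRequests) {
        int delivered = 0;
        for (int i = 0; i < maxRequests && !queue.isEmpty(); i++) {
            QueuedUpdate u = queue.poll();
            if (liveNodes.contains(u.targetNode)) {
                delivered++;
            } else {
                // The failure is only logged here; nothing reports it to the caller.
                System.err.println("Connection refused: " + u.targetNode + " (" + u.docId + ")");
            }
        }
        return delivered;
    }

    /** Returns {accepted, delivered} when the old leader dies part-way through. */
    static int[] simulate(int totalDocs, int sentBeforeCrash) {
        String oldLeader = "devm50:8092";
        String newLeader = "devm50:8093";
        Set<String> liveNodes = new HashSet<>();
        liveNodes.add(oldLeader);
        liveNodes.add(newLeader);

        Queue<QueuedUpdate> queue = new ArrayDeque<>();
        int accepted = 0;
        for (int i = 0; i < totalDocs; i++) {
            queue.add(new QueuedUpdate("doc" + i, oldLeader)); // bound to the leader at enqueue time
            accepted++;                                        // the caller counts this as indexed
        }

        int delivered = drain(queue, liveNodes, sentBeforeCrash);
        liveNodes.remove(oldLeader);                           // leader goes down, newLeader takes over
        delivered += drain(queue, liveNodes, Integer.MAX_VALUE);
        return new int[] {accepted, delivered};
    }

    public static void main(String[] args) {
        int[] r = simulate(10, 4);
        System.out.println("accepted=" + r[0] + " delivered=" + r[1]); // accepted=10 delivered=4
    }
}
```

The new leader is alive the whole time, but no queued request is ever
re-pointed at it, which mirrors the mismatch between the import counter and
the search result.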

We have tried to solve this issue by modifying the ConcurrentUpdateSolrClient
class:

  * We added code that, when an IOException is generated in
    ConcurrentUpdateSolrClient's sendInputStream method, modifies the
    current client request and resubmits it to the new leader node.

  * With the modified code we are able to index some of the failed
    documents, but some documents are still missing: a search now returns
    9850 documents.
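
The resubmit idea can be sketched as follows. This is not our actual patch
and not the SolrJ API: Sender, sendWithLeaderRefresh and the leader lookup
are hypothetical stand-ins, assuming only that the current leader can be
looked up again after a failure. On an IOException the request is retried
against the freshly resolved leader, and the method returns false when all
attempts fail so that the caller can count the document as not indexed.

```java
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical sketch of "resubmit to the new leader"; names are assumptions.
class ResubmitToNewLeaderDemo {

    /** Stand-in for the network call that pushes one document to a node. */
    interface Sender {
        void send(String leaderNode, String docId) throws IOException;
    }

    /**
     * Tries the node the request was originally bound to, then re-resolves
     * the leader after each IOException. Returns true if a send succeeded.
     */
    static boolean sendWithLeaderRefresh(String docId,
                                         String boundLeader,
                                         Supplier<String> currentLeader,
                                         Sender sender,
                                         int maxAttempts) {
        String target = boundLeader;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                sender.send(target, docId);
                return true;
            } catch (IOException e) {
                // Do not keep the stale target: ask for the current leader again.
                target = currentLeader.get();
            }
        }
        return false; // reported to the caller instead of being swallowed
    }

    public static void main(String[] args) {
        Sender sender = (node, doc) -> {
            if (node.equals("devm50:8092")) {
                throw new IOException("Connection refused");
            }
        };
        boolean ok = sendWithLeaderRefresh("doc1", "devm50:8092",
                () -> "devm50:8093", sender, 3);
        System.out.println("delivered=" + ok); // delivered=true
    }
}
```

A sketch like this only covers requests that fail outright with an
IOException. It does not by itself explain the documents that are still
missing after our change.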


Please find the log attached.


Amit Makwana | Software Engineer

CAPITAL NOVUS

Governance  |  Compliance  |  eDiscovery
A-501, APPL, IT-SEZ,
K Raheja Road, Koba, Gandhinagar: 382009
Office: 079.65721500 | Extn: 1646
AMakwana@capitalnovus.com | www.capitalnovus.com

Washington, DC | New York | London | Paris | Gandhinagar | Tokyo
