lucene-solr-user mailing list archives

From Amrit Sarkar <sarkaramr...@gmail.com>
Subject Re: Issue with CDCR bootstrapping in Solr 7.1
Date Thu, 30 Nov 2017 16:22:41 GMT
Hi Tom,

I see what you are saying, and I also think this is a bug, but I will confirm
it against the code first. Bootstrapping should happen on all the nodes of the
target.

Meanwhile, can you index more than 100 documents in the source and run the
exact same experiment again? Followers will not copy the entire index from the
leader unless the difference in document versions is more than
"numRecordsToKeep", which defaults to 100 unless you have changed it in
solrconfig.xml.
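
One way to get past that threshold is to bulk-index a batch of throwaway
documents into the source before starting CDCR. This is only a minimal sketch:
it assumes the collection accepts documents that carry nothing but an "id"
field, reuses the host and port from the original message, and the helper
names gen_docs and index_docs and the "cdcr-test-" id prefix are made up for
illustration.

```shell
# Print a JSON array of N minimal documents, e.g. [{"id":"cdcr-test-1"},...]
# The "cdcr-test-" prefix is an arbitrary choice for this sketch.
gen_docs() (
  n=$1
  i=1
  sep=''
  printf '['
  while [ "$i" -le "$n" ]; do
    printf '%s{"id":"cdcr-test-%d"}' "$sep" "$i"
    sep=','
    i=$((i + 1))
  done
  printf ']\n'
)

# Post N generated documents to a collection's /update handler and commit.
# Arguments: host:port, collection name, document count.
index_docs() (
  node=$1
  collection=$2
  n=$3
  gen_docs "$n" | curl -s -H 'Content-Type: application/json' \
    --data-binary @- "http://$node/solr/$collection/update?commit=true"
)
```

For example, "index_docs solr01-a:8080 mycollection 150" would add 150
documents to the source, which is above the default of 100.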

Looking forward to your analysis.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2

On Thu, Nov 30, 2017 at 9:03 PM, Tom Peters <tpeters@synacor.com> wrote:

> I'm running into an issue with the initial CDCR bootstrapping of an
> existing index. In short, after turning on CDCR, only the leader replica in
> the target data center receives the existing documents; they do not appear
> in any of the follower replicas in the target data center. All subsequent
> incremental updates made to the source data center do appear in all
> replicas in the target data center.
>
> A few more details:
>
> I have two clusters set up, a source cluster and a target cluster. Each
> cluster has only one shard and three replicas. I used the configuration
> detailed in the Source and Target sections of the reference guide as-is,
> with the exception of updating the zkHost
> (https://lucene.apache.org/solr/guide/7_1/cross-data-center-replication-cdcr.html#cdcr-configuration-2).
>
> The source data center has the following nodes:
>         solr01-a, solr01-b, and solr01-c
>
> The target data center has the following nodes:
>         solr02-a, solr02-b, and solr02-c
>
> Here are the steps that I've taken:
>
> 1. Create collection in source and target data centers
>
> 2. Add a number of documents to the source data center
>
> 3. Verify:
>
>     $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
>     solr01-a: 81
>     solr01-b: 81
>     solr01-c: 81
>     solr02-a: 0
>     solr02-b: 0
>     solr02-c: 0
>
> 4. Start CDCR:
>
>     $ curl 'solr01-a:8080/solr/mycollection/cdcr?action=START'
>
> 5. See if target data center has received the initial index
>
>     $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
>     solr01-a: 81
>     solr01-b: 81
>     solr01-c: 81
>     solr02-a: 0
>     solr02-b: 0
>     solr02-c: 81
>
>     Note: only solr02-c has received the index.
>
> 6. Add another document to the source cluster
>
> 7. See how many documents are in each node:
>
>     $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
>     solr01-a: 82
>     solr01-b: 82
>     solr01-c: 82
>     solr02-a: 1
>     solr02-b: 1
>     solr02-c: 82
>
>
> As you can see, the initial index only made it to one of the replicas in
> the target data center, but subsequent incremental updates have appeared
> everywhere I would expect. Any help would be greatly appreciated, thanks.
>
>
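
The per-node count checks in the quoted steps could also be wrapped in a small
helper when the experiment is repeated. This is a sketch, not part of the
original thread: the function names count_url and replica_counts are
hypothetical, and it assumes each node hosts one core of the collection. It
adds rows=0 to skip returning documents and distrib=false so that each core
answers from its own index only, which makes the local-only intent explicit
rather than relying on how the node routes the request.

```shell
# Build the select URL for one node (host:port) and collection.
# rows=0 skips returning documents; distrib=false asks the receiving
# core to answer from its own index only.
count_url() {
  printf 'http://%s/solr/%s/select?q=*:*&rows=0&distrib=false' "$1" "$2"
}

# Print "node: numFound" for every node listed after the collection name.
# Requires curl and jq, as in the commands from the original message.
replica_counts() (
  collection=$1
  shift
  for node in "$@"; do
    printf '%s: ' "$node"
    curl -s "$(count_url "$node" "$collection")" | jq '.response.numFound'
  done
)
```

For example: "replica_counts mycollection solr02-a:8080 solr02-b:8080
solr02-c:8080" prints one count per target node.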
