lucene-solr-user mailing list archives

From Amrit Sarkar <sarkaramr...@gmail.com>
Subject Re: Issue with CDCR bootstrapping in Solr 7.1
Date Fri, 01 Dec 2017 05:52:29 GMT
Tom,

> (and take care not to restart the leader node otherwise it will replicate
> from one of the replicas which is missing the index).

How is this possible? OK, I will look into it further. I would appreciate it
if anyone else who has seen a similar issue chimes in.
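
In case it helps anyone reproducing the per-node counts quoted below, here is a minimal sketch that asks each node only for its own core's count. It assumes the host names, port 8080, and the "mycollection" name used in this thread. The helper names count_url and print_counts are hypothetical. distrib=false is there because, as far as I know, a plain select against a collection may be answered by a different replica than the one addressed.

```shell
#!/usr/bin/env bash
# Sketch only: hosts, port 8080 and the collection name come from this thread;
# count_url and print_counts are hypothetical helper names.

# Build a select URL that is answered by the core on the receiving node only.
# distrib=false keeps Solr from forwarding the query to another replica;
# rows=0 skips returning documents, since only numFound is needed.
count_url() {
  local host="$1" collection="$2"
  printf 'http://%s:8080/solr/%s/select?q=*:*&rows=0&distrib=false\n' "$host" "$collection"
}

# Print "<host>: <numFound>" for every host given after the collection name.
print_counts() {
  local collection="$1"; shift
  local host
  for host in "$@"; do
    printf '%s: ' "$host"
    curl -s "$(count_url "$host" "$collection")" | jq '.response.numFound'
  done
}
```

Usage would be something like: print_counts mycollection solr0{1,2}-{a,b,c}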

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2

On Fri, Dec 1, 2017 at 4:49 AM, Tom Peters <tpeters@synacor.com> wrote:

> Hi Amrit, I tried issuing hard commits to the various nodes in the target
> cluster and it does not appear to cause the follower replicas to receive
> the initial index. The only way I can get the replicas to see the original
> index is by restarting those nodes (and take care not to restart the leader
> node otherwise it will replicate from one of the replicas which is missing
> the index).
>
>
> > On Nov 30, 2017, at 12:16 PM, Amrit Sarkar <sarkaramrit2@gmail.com> wrote:
> >
> > Tom,
> >
> > This is very useful:
> >
> >> I found a way to get the follower replicas to receive the documents from
> >> the leader in the target data center, I have to restart the solr instance
> >> running on that server. Not sure if this information helps at all.
> >
> >
> > You have to issue a hard commit on the target after the bootstrapping is
> > done. Reloading makes the core open a new searcher. While an explicit
> > commit is issued at the target leader after the bootstrap is done, the
> > followers are left unattended even though the docs are copied over.
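
A minimal sketch of the explicit commit suggested above, assuming the target host names, port 8080, and the "mycollection" name from this thread. commit_url and commit_on are hypothetical helper names. Whether the commit is enough to make the followers show the bootstrapped index is the open question here, so treat this as a way to run the experiment, not as a confirmed fix.

```shell
#!/usr/bin/env bash
# Sketch only: hosts, port 8080 and the collection name come from this thread;
# commit_url and commit_on are hypothetical helper names.

# Build the update URL that asks Solr for an explicit (hard) commit.
commit_url() {
  local host="$1" collection="$2"
  printf 'http://%s:8080/solr/%s/update?commit=true\n' "$host" "$collection"
}

# Send the commit request to every host given after the collection name.
commit_on() {
  local collection="$1"; shift
  local host
  for host in "$@"; do
    printf '%s: ' "$host"
    curl -s "$(commit_url "$host" "$collection")"
    printf '\n'
  done
}
```

Usage would be something like: commit_on mycollection solr02-{a,b,c}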
> >
> > On Thu, Nov 30, 2017 at 10:06 PM, Tom Peters <tpeters@synacor.com> wrote:
> >
> >> Hi Amrit,
> >>
> >> Starting with more documents doesn't appear to have made a difference.
> >> This time I tried with >1000 docs. Here are the steps I took:
> >>
> >> 1. Deleted the collection on both the source and target DCs.
> >>
> >> 2. Recreated the collections.
> >>
> >> 3. Indexed >1000 documents on source data center, hard commit
> >>
> >>  $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s \
> >>      $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >>  solr01-a: 1368
> >>  solr01-b: 1368
> >>  solr01-c: 1368
> >>  solr02-a: 0
> >>  solr02-b: 0
> >>  solr02-c: 0
> >>
> >> 4. Enabled CDCR and checked docs
> >>
> >>  $ curl 'solr01-a:8080/solr/mycollection/cdcr?action=START'
> >>
> >>  $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s \
> >>      $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >>  solr01-a: 1368
> >>  solr01-b: 1368
> >>  solr01-c: 1368
> >>  solr02-a: 0
> >>  solr02-b: 0
> >>  solr02-c: 1368
> >>
> >> Some additional notes:
> >>
> >> * I do not have numRecordsToKeep defined in my solrconfig.xml, so I assume
> >> it will use the default of 100
> >>
> >> * I found a way to get the follower replicas to receive the documents from
> >> the leader in the target data center, I have to restart the solr instance
> >> running on that server. Not sure if this information helps at all.
> >>
> >>> On Nov 30, 2017, at 11:22 AM, Amrit Sarkar <sarkaramrit2@gmail.com> wrote:
> >>>
> >>> Hi Tom,
> >>>
> >>> I see what you are saying and I too think this is a bug, but I will
> >>> confirm it against the code. Bootstrapping should happen on all the nodes
> >>> of the target.
> >>>
> >>> Meanwhile, can you index more than 100 documents in the source and run
> >>> the exact same experiment again? Followers will not copy the entire index
> >>> of the leader unless the difference in document versions is more than
> >>> "numRecordsToKeep", which defaults to 100 unless you have changed it in
> >>> solrconfig.xml.
> >>>
> >>> Looking forward to your analysis.
> >>>
> >>> On Thu, Nov 30, 2017 at 9:03 PM, Tom Peters <tpeters@synacor.com> wrote:
> >>>
> >>>> I'm running into an issue with the initial CDCR bootstrapping of an
> >>>> existing index. In short, after turning on CDCR, only the leader replica
> >>>> in the target data center receives the replicated documents; they do not
> >>>> appear in any of the follower replicas in the target data center. All
> >>>> subsequent incremental updates made to the source data center do appear
> >>>> in all replicas in the target data center.
> >>>>
> >>>> A few more details:
> >>>>
> >>>> I have two clusters set up, a source cluster and a target cluster. Each
> >>>> cluster has only one shard and three replicas. I used the configuration
> >>>> detailed in the Source and Target sections of the reference guide as-is,
> >>>> with the exception of updating the zkHost
> >>>> (https://lucene.apache.org/solr/guide/7_1/cross-data-center-replication-cdcr.html#cdcr-configuration-2).
> >>>>
> >>>> The source data center has the following nodes:
> >>>>       solr01-a, solr01-b, and solr01-c
> >>>>
> >>>> The target data center has the following nodes:
> >>>>       solr02-a, solr02-b, and solr02-c
> >>>>
> >>>> Here are the steps that I've done:
> >>>>
> >>>> 1. Create collection in source and target data centers
> >>>>
> >>>> 2. Add a number of documents to the source data center
> >>>>
> >>>> 3. Verify:
> >>>>
> >>>>   $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s \
> >>>>       $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >>>>   solr01-a: 81
> >>>>   solr01-b: 81
> >>>>   solr01-c: 81
> >>>>   solr02-a: 0
> >>>>   solr02-b: 0
> >>>>   solr02-c: 0
> >>>>
> >>>> 4. Start CDCR:
> >>>>
> >>>>   $ curl 'solr01-a:8080/solr/mycollection/cdcr?action=START'
> >>>>
> >>>> 5. See if target data center has received the initial index
> >>>>
> >>>>   $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s \
> >>>>       $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >>>>   solr01-a: 81
> >>>>   solr01-b: 81
> >>>>   solr01-c: 81
> >>>>   solr02-a: 0
> >>>>   solr02-b: 0
> >>>>   solr02-c: 81
> >>>>
> >>>>   note: only -c has received the index
> >>>>
> >>>> 6. Add another document to the source cluster
> >>>>
> >>>> 7. See how many documents are in each node:
> >>>>
> >>>>   $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s \
> >>>>       $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >>>>   solr01-a: 82
> >>>>   solr01-b: 82
> >>>>   solr01-c: 82
> >>>>   solr02-a: 1
> >>>>   solr02-b: 1
> >>>>   solr02-c: 82
> >>>>
> >>>>
> >>>> As you can see, the initial index only made it to one of the replicas in
> >>>> the target data center, but subsequent incremental updates have appeared
> >>>> everywhere I would expect. Any help would be greatly appreciated, thanks.
> >>>>
> >>>>
> >>>>
> >>>> This message and any attachment may contain information that is
> >>>> confidential and/or proprietary. Any use, disclosure, copying, storing, or
> >>>> distribution of this e-mail or any attached file by anyone other than the
> >>>> intended recipient is strictly prohibited. If you have received this
> >>>> message in error, please notify the sender by reply email and delete the
> >>>> message and any attachments. Thank you.
> >>>>
