lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Nielsen ...@mcb.dk>
Subject Re: Strange data-loss problem on one of our cores
Date Fri, 14 Dec 2012 12:09:09 GMT
I'm building a simple tool which will help us monitor the solr cores for
this problem. Basically it does a q=*:* on both servers on each cores and
compares numFound of each result. Problem is that since this is a cloud
setup, i can't be sure which server gets me the result. Is there a
parameter I can add to the GET requests that will lock the request to a
specific node in the cluster, treating the server receiving the request as
a standalone server as opposed to a member of a cluster?

I tried googeling it without luck.



-- 
Med venlig hilsen / Best regards

*John Nielsen*
Programmer



*MCB A/S*
Enghaven 15
DK-7500 Holstebro

Kundeservice: +45 9610 2824
post@mcb.dk
www.mcb.dk



On Fri, Dec 14, 2012 at 12:36 PM, Markus Jelsma
<markus.jelsma@openindex.io>wrote:

> We did not solve it but reindexing can remedy the problem.
>
> -----Original message-----
> > From:John Nielsen <jn@mcb.dk>
> > Sent: Fri 14-Dec-2012 12:31
> > To: solr-user@lucene.apache.org
> > Subject: Re: Strange data-loss problem on one of our cores
> >
> > How did you solve the problem?
> >
> >
> > --
> > Med venlig hilsen / Best regards
> >
> > *John Nielsen*
> > Programmer
> >
> >
> >
> > *MCB A/S*
> > Enghaven 15
> > DK-7500 Holstebro
> >
> > Kundeservice: +45 9610 2824
> > post@mcb.dk
> > www.mcb.dk
> >
> >
> >
> > On Fri, Dec 14, 2012 at 12:04 PM, Markus Jelsma
> > <markus.jelsma@openindex.io>wrote:
> >
> > > FYI, we observe the same issue, after some time (days, months) a
> cluster
> > > running an older trunk version has at least two shards where the
> leader and
> > > the replica do not contain the same number of records. No recovery is
> > > attempted, it seems it thinks everything is alright. Also, one core of
> one
> > > of the unsynced shards waits forever loading
> > > /replication?command=detail&wt=json, other cores load it in a few ms.
> Both
> > > cores of another unsynced shard does not show this problem.
> > >
> > > -----Original message-----
> > > > From:John Nielsen <jn@mcb.dk>
> > > > Sent: Fri 14-Dec-2012 11:50
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Re: Strange data-loss problem on one of our cores
> > > >
> > > > I did a manual commit, and we are still missing docs, so it doesn't
> look
> > > > like the search race condition you mention.
> > > >
> > > > My boss wasn't happy when i mentioned that I wanted to try out
> unreleased
> > > > code. Ill get him won over though and return with my findings. It
> will
> > > > probably be some time next week.
> > > >
> > > > Thanks for your help.
> > > >
> > > >
> > > > --
> > > > Med venlig hilsen / Best regards
> > > >
> > > > *John Nielsen*
> > > > Programmer
> > > >
> > > >
> > > >
> > > > *MCB A/S*
> > > > Enghaven 15
> > > > DK-7500 Holstebro
> > > >
> > > > Kundeservice: +45 9610 2824
> > > > post@mcb.dk
> > > > www.mcb.dk
> > > >
> > > >
> > > >
> > > > On Thu, Dec 13, 2012 at 4:10 PM, Mark Miller <markrmiller@gmail.com>
> > > wrote:
> > > >
> > > > > Couple things to start:
> > > > >
> > > > > By default SolrCloud distributes updates a doc at a time. So if you
> > > have 1
> > > > > shard, whatever node you index too, it will send updates to the
> other.
> > > > > Replication is only used for recovery, not distributing data. So
> for
> > > some
> > > > > reason, there is an IOException when it tries to forward.
> > > > >
> > > > > The other issue is not something that Ive seen reported. Can/did
> you
> > > try
> > > > > and do another hard commit to make sure you had the latest search
> open
> > > when
> > > > > checking the # of docs on each node? There was previously a race
> around
> > > > > commit that could cause some issues around expected visibility.
> > > > >
> > > > > If you are able to, you might try out a nightly build - 4.1 will
be
> > > ready
> > > > > very soon and has numerous bug fixes for SolrCloud.
> > > > >
> > > > > - Mark
> > > > >
> > > > > On Dec 13, 2012, at 9:53 AM, John Nielsen <jn@mcb.dk> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > We are seeing a strange problem on our 2-node solr4 cluster.
This
> > > problem
> > > > > > has resultet in data loss.
> > > > > >
> > > > > > We have two servers, varnish01 and varnish02. Zookeeper is
> running on
> > > > > > varnish02, but in a separate jvm.
> > > > > >
> > > > > > We index directly to varnish02 and we read from varnish01. Data
> is
> > > thus
> > > > > > replicated from varnish02 to varnish01.
> > > > > >
> > > > > > I found this in the varnish01 log:
> > > > > >
> > > > > > *INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=42
> > > > > > Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=41
> > > > > > Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=33
> > > > > > Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=33
> > > > > > Dec 13, 2012 12:23:39 PM org.apache.solr.common.SolrException
log
> > > > > > SEVERE: shard update error StdNode:
> > > > > >
> > > > >
> > >
> http://varnish02.lynero.net:8000/solr/default1_Norwegian/:org.apache.solr.client.solrj.SolrServerException
> > > > > :
> > > > > > IOException occured when talking to server at:
> > > > > > http://varnish02.lynero.net:8000/solr/default1_Norwegian
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:335)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:309)
> > > > > >    at
> > > java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> > > > > >    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> > > > > >    at
> > > > > >
> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> > > > > >    at
> > > java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> > > > > >    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> > > > > >    at
> > > > > >
> > > > >
> > >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > > > > >    at
> > > > > >
> > > > >
> > >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > > > > >    at java.lang.Thread.run(Thread.java:636)
> > > > > > Caused by: org.apache.http.NoHttpResponseException: The target
> server
> > > > > > failed to respond
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:101)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
> > > > > >    at
> > > > > >
> > > > >
> > >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
> > > > > >    ... 11 more
> > > > > >
> > > > > > Dec 13, 2012 12:23:39 PM
> > > > > > org.apache.solr.update.processor.DistributedUpdateProcessor
> doFinish
> > > > > > INFO: try and ask http://varnish02.lynero.net:8000/solr to
> recover*
> > > > > >
> > > > > > It looks like it is sending updates from varnish01 to varnish02.
> I
> > > am not
> > > > > > sure for what since we only index on varnish02. Updates should
> never
> > > be
> > > > > > going from varnish01 to varnish02.
> > > > > >
> > > > > > Meanwhile on varnish02:
> > > > > >
> > > > > > *INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=16
> > > > > > Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=15
> > > > > > Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=16
> > > > > > Dec 13, 2012 12:23:42 PM
> > > org.apache.solr.handler.admin.CoreAdminHandler
> > > > > > handleRequestRecoveryAction
> > > > > > INFO: It has been requested that we recover*
> > > > > > *Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Danish] webapp=/solr path=/select
> > > > > >
> > > > >
> > >
> params={facet=false&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=
> > > > > >
> > > > >
> > >
> varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822111&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text
> > > > >
> > >
> ^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true}
> > > > > > status=0 QTime=1
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Danish] webapp=/solr path=/select/
> > > > > > params={fq=site_guid:(2810678)&q=win} hits=0 status=0 QTime=17
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Danish] webapp=/solr path=/select
> > > > > >
> > > > >
> > >
> params={facet=on&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&group.distributed.second=true&version=2&df=text&fl=docid&shard.url=
> > > > > >
> > > > >
> > >
> varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822111&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&qf=%0a++++++++++text
> > > > >
> > >
> ^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true}
> > > > > > status=0 QTime=1
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Danish] webapp=/solr path=/select
> > > > > >
> > > > >
> > >
> params={facet=false&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=
> > > > > >
> > > > >
> > >
> varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822138&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text
> > > > >
> > >
> ^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=40&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true}
> > > > > > status=0 QTime=1
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Danish] webapp=/solr path=/select
> > > > > >
> > > > >
> > >
> params={facet=on&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&group.distributed.second=true&version=2&df=text&fl=docid&shard.url=
> > > > > >
> > > > >
> > >
> varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822138&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&group.topgroups.groupby_variant_of_item_guid=2963217&group.topgroups.groupby_variant_of_item_guid=2963223&group.topgroups.groupby_variant_of_item_guid=2963219&group.topgroups.groupby_variant_of_item_guid=2963220&group.topgroups.groupby_variant_of_item_guid=2963221&group.topgroups.groupby_variant_of_item_guid=2963222&group.topgroups.groupby_variant_of_item_guid=2963224&group.topgroups.groupby_variant_of_item_guid=2963218&qf=%0a++++++++++text
> > > > >
> > >
> ^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=40&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true}
> > > > > > status=0 QTime=1
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=26
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=22
> > > > > > Dec 13, 2012 12:23:42 PM
> org.apache.solr.update.DefaultSolrCoreState
> > > > > > doRecovery
> > > > > > Dec 13, 2012 12:23:42 PM
> org.apache.solr.update.DefaultSolrCoreState
> > > > > > doRecovery
> > > > > > INFO: Running recovery - first canceling any ongoing recovery
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=25
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=24
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=20
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=25
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=23
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=21
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=23
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Norwegian] webapp=/solr path=/update
> > > > > params={distrib.from=
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2
> > > > > }
> > > > > > status=0 QTime=16
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy
> run
> > > > > > INFO: Starting recovery process.  core=default1_Norwegian
> > > > > > recoveringAfterStartup=false
> > > > > > Dec 13, 2012 12:23:42 PM
> org.apache.solr.common.cloud.ZkStateReader
> > > > > > updateClusterState
> > > > > > INFO: Updating cloud state from ZooKeeper...
> > > > > > Dec 13, 2012 12:23:42 PM
> > > > > > org.apache.solr.update.processor.LogUpdateProcessor finish*
> > > > > >
> > > > > > And less than a second later:
> > > > > >
> > > > > > *Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy
> > > > > doRecovery
> > > > > > INFO: Attempting to PeerSync from
> > > > > >
> > > > >
> > >
> http://varnish01.lynero.net:8000/solr/default1_Norwegian/core=default1_Norwegian
> > > > > > - recoveringAfterStartup=false
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.update.PeerSync sync
> > > > > > INFO: PeerSync: core=default1_Norwegian url=
> > > > > > http://varnish02.lynero.net:8000/solr START replicas=[
> > > > > > http://varnish01.lynero.net:8000/solr/default1_Norwegian/]
> > > nUpdates=100
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.update.PeerSync sync
> > > > > > WARNING: PeerSync: core=default1_Norwegian url=
> > > > > > http://varnish02.lynero.net:8000/solr too many updates received
> > > since
> > > > > start
> > > > > > - startingUpdates no longer overlaps with our currentUpdates
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy
> > > > > doRecovery
> > > > > > INFO: PeerSync Recovery was not successful - trying replication.
> > > > > > core=default1_Norwegian
> > > > > > Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy
> > > > > doRecovery
> > > > > > INFO: Starting Replication Recovery. core=default1_Norwegian
> > > > > > Dec 13, 2012 12:23:42 PM
> > > org.apache.solr.client.solrj.impl.HttpClientUtil
> > > > > > createClient
> > > > > > INFO: Creating new http client,
> > > > > >
> > >
> config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > > > > > Dec 13, 2012 12:23:42 PM
> org.apache.solr.common.cloud.ZkStateReader$2
> > > > > > process
> > > > > > INFO: A cluster state change has occurred - updating...*
> > > > > >
> > > > > > State change on varnish01 at the same time:
> > > > > >
> > > > > > *Dec 13, 2012 12:23:42 PM
> > > org.apache.solr.common.cloud.ZkStateReader$2
> > > > > > process
> > > > > > INFO: A cluster state change has occurred - updating...*
> > > > > > *
> > > > > > *And a few seconds later on varnish02, the recovery finishes:
> > > > > > *
> > > > > > Dec 13, 2012 12:23:48 PM org.apache.solr.cloud.RecoveryStrategy
> > > > > doRecovery
> > > > > > INFO: Replication Recovery was successful - registering as
> Active.
> > > > > > core=default1_Norwegian
> > > > > > Dec 13, 2012 12:23:48 PM org.apache.solr.cloud.RecoveryStrategy
> > > > > doRecovery
> > > > > > INFO: Finished recovery process. core=default1_Norwegian
> > > > > > Dec 13, 2012 12:23:48 PM org.apache.solr.core.SolrCore execute
> > > > > > INFO: [default1_Danish] webapp=/solr path=/select
> > > > > >
> > > > >
> > >
> params={facet=false&sort=item_group_56823_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=
> > > > > >
> > > > >
> > >
> varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397828395&group.field=groupby_variant_of_item_guid&facet.field=itemgroups_int_mv&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_56823_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text
> > > > >
> > >
> ^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true}
> > > > > > status=0 QTime=8
> > > > > > Dec 13, 2012 12:23:48 PM
> org.apache.solr.common.cloud.ZkStateReader
> > > > > > updateClusterState
> > > > > > INFO: Updating cloud state from ZooKeeper... *
> > > > > >
> > > > > > Which is picked up on varnish01:
> > > > > >
> > > > > > *Dec 13, 2012 12:23:48 PM
> > > org.apache.solr.common.cloud.ZkStateReader$2
> > > > > > process
> > > > > > INFO: A cluster state change has occurred - updating...*
> > > > > >
> > > > > > It looks like it replicated successfully, only it didnt. The
> > > > > > default1_Norwegian core on varnish01 now has 55.071 docs and
the
> same
> > > > > core
> > > > > > on varnish02 has 35.088 docs.
> > > > > >
> > > > > > I checked the log files for both JVM's and no stop-the-world
GC
> were
> > > > > taking
> > > > > > place.
> > > > > >
> > > > > > There is also nothing in the zookeeper log of interest that
I can
> > > see.
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Med venlig hilsen / Best regards
> > > > > >
> > > > > > *John Nielsen*
> > > > > > Programmer
> > > > > >
> > > > > >
> > > > > >
> > > > > > *MCB A/S*
> > > > > > Enghaven 15
> > > > > > DK-7500 Holstebro
> > > > > >
> > > > > > Kundeservice: +45 9610 2824
> > > > > > post@mcb.dk
> > > > > > www.mcb.dk
> > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message