lucene-solr-user mailing list archives

From Jamie Johnson <jej2...@gmail.com>
Subject Re: Solr 4.2 Cloud Replication Replica has higher version than Master?
Date Tue, 02 Apr 2013 21:43:40 GMT
Here is another exception that looks interesting:

Apr 2, 2013 7:27:14 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: ClusterState says we are the leader, but locally we don't think so
        at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:293)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:228)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:339)
        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
        at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
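
For anyone wanting to scan their own logs for the same symptoms, here is a minimal sketch that pulls SEVERE entries and live-node counts out of a log dump. It assumes the message layout seen in this thread ("SEVERE: ..." lines, "live nodes size: N" and "Updating live nodes... (N)"); the function name scan_log and the sample text are only for illustration and are not part of Solr.

```python
import re

# Matches the two live-node messages seen in the logs in this thread.
LIVE_NODES = re.compile(r"nodes size: (\d+)|Updating live nodes\.\.\. \((\d+)\)")


def scan_log(text):
    """Return (severe_messages, live_node_counts) found in a log dump."""
    severe = []
    counts = []
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()  # tolerate quoted e-mail prefixes
        if line.startswith("SEVERE:"):
            severe.append(line[len("SEVERE:"):].strip())
        match = LIVE_NODES.search(line)
        if match:
            counts.append(int(match.group(1) or match.group(2)))
    return severe, counts


# Sample assembled from the log lines posted in this thread.
sample = """\
INFO: A cluster state change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 12)
Apr 2, 2013 8:12:52 PM org.apache.solr.common.cloud.ZkStateReader$3 process
INFO: Updating live nodes... (9)
Apr 2, 2013 7:27:14 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: ClusterState says we are the leader, but locally we don't think so
"""

severe, counts = scan_log(sample)
print(counts)
print(severe)
```

On the sample the counts come out as 12 followed by 9, which would be consistent with nodes dropping out of /live_nodes around the time of the leader election.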



On Tue, Apr 2, 2013 at 5:41 PM, Jamie Johnson <jej2003@gmail.com> wrote:

> Looking at the master's log, it appears that some shards went down at some
> point.  I am seeing entries like the ones below.
>
> INFO: A cluster state change: WatchedEvent state:SyncConnected
> type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live
> nodes size: 12)
> Apr 2, 2013 8:12:52 PM org.apache.solr.common.cloud.ZkStateReader$3 process
> INFO: Updating live nodes... (9)
> Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess
> INFO: Running the leader process.
> Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
> INFO: Checking if I should try and be the leader.
> Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
> INFO: My last published State was Active, it's okay to be the leader.
> Apr 2, 2013 8:12:52 PM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess
> INFO: I may be the new leader - try and sync
>
>
>
> On Tue, Apr 2, 2013 at 5:09 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>> I don't think the versions you are thinking of apply here. PeerSync does
>> not look at the index version. It looks at the version numbers of updates in
>> the transaction log and compares the last 100 of them on the leader and the
>> replica. What it's saying is that the replica seems to have versions that the
>> leader does not. Have you scanned the logs for any interesting exceptions?
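
To make the comparison described above concrete, here is a rough sketch of the idea: take the most recent update versions from each side and list the ones only the replica has. This only illustrates the description in this thread and is not Solr's actual PeerSync code; the function names and the sample version numbers are made up, and the default of 100 mirrors the nUpdates=100 shown in the PeerSync log lines.

```python
def recent_versions(versions, n_updates=100):
    """Return the n_updates highest (most recent) versions, newest first."""
    return sorted(versions, reverse=True)[:n_updates]


def only_on_replica(leader_versions, replica_versions, n_updates=100):
    """List recent replica versions that are missing from the leader's recent window."""
    leader_recent = set(recent_versions(leader_versions, n_updates))
    return [v for v in recent_versions(replica_versions, n_updates)
            if v not in leader_recent]


# Hypothetical version numbers, for illustration only.
leader = [101, 102, 103]
replica = [101, 102, 103, 104, 105]
print(only_on_replica(leader, replica))  # [105, 104]
```

A non-empty result corresponds to the situation described here: the replica holds updates in its transaction log that the leader never recorded.
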
>>
>> Did the leader change during the heavy indexing? Did any zk session
>> timeouts occur?
>>
>> - Mark
>>
>> On Apr 2, 2013, at 4:52 PM, Jamie Johnson <jej2003@gmail.com> wrote:
>>
>> > I am currently looking at moving our Solr cluster to 4.2 and noticed a
>> > strange issue while testing today.  Specifically, the replica has a higher
>> > version than the master, which is causing the index to not replicate.
>> > Because of this the replica has fewer documents than the master.  What
>> > could cause this and how can I resolve it short of taking down the index
>> > and scping the right version in?
>> >
>> > MASTER:
>> > Last Modified:about an hour ago
>> > Num Docs:164880
>> > Max Doc:164880
>> > Deleted Docs:0
>> > Version:2387
>> > Segment Count:23
>> >
>> > REPLICA:
>> > Last Modified: about an hour ago
>> > Num Docs:164773
>> > Max Doc:164773
>> > Deleted Docs:0
>> > Version:3001
>> > Segment Count:30
>> >
>> > in the replicas log it says this:
>> >
>> > INFO: Creating new http client, config:maxConnectionsPerHost=20&maxConnections=10000&connTimeout=30000&socketTimeout=30000&retry=false
>> >
>> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync sync
>> >
>> > INFO: PeerSync: core=dsc-shard5-core2
>> > url=http://10.38.33.17:7577/solrSTART replicas=[
>> > http://10.38.33.16:7575/solr/dsc-shard5-core1/] nUpdates=100
>> >
>> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync handleVersions
>> >
>> > INFO: PeerSync: core=dsc-shard5-core2 url=http://10.38.33.17:7577/solr
>> > Received 100 versions from 10.38.33.16:7575/solr/dsc-shard5-core1/
>> >
>> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync handleVersions
>> >
>> > INFO: PeerSync: core=dsc-shard5-core2 url=http://10.38.33.17:7577/solr Our
>> > versions are newer. ourLowThreshold=1431233788792274944
>> > otherHigh=1431233789440294912
>> >
>> > Apr 2, 2013 8:15:06 PM org.apache.solr.update.PeerSync sync
>> >
>> > INFO: PeerSync: core=dsc-shard5-core2
>> > url=http://10.38.33.17:7577/solrDONE. sync succeeded
>> >
>> >
>> > which again seems to indicate that it thinks it has a newer version of the
>> > index, so it aborts.  This happened while 10 threads were indexing 10,000
>> > items, writing to a 6 shard (1 replica each) cluster.  Any thoughts on this
>> > or what I should look for would be appreciated.
>>
>>
>
