lucene-solr-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: shard1 gone missing ...
Date Fri, 31 Jan 2014 15:22:55 GMT
Would probably need to see some logs to have an idea of what happened.

Would also be nice to see the after state of zk in a text dump.
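
If it helps, here is a rough sketch of grabbing such a dump with the Python kazoo
client. kazoo and the zk1/zk2/zk3 host string are just assumptions on my part, not
something your setup necessarily uses:

    # Rough sketch: recursively dump the ZooKeeper tree as text.
    # Assumes the Python "kazoo" client; replace the host string with your ensemble.
    from kazoo.client import KazooClient

    def dump(zk, path="/", indent=0):
        data, stat = zk.get(path)
        line = "  " * indent + path
        if data:
            line += " = " + data.decode("utf-8", "replace")
        print(line)
        for child in sorted(zk.get_children(path)):
            dump(zk, path.rstrip("/") + "/" + child, indent + 1)

    zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
    zk.start()
    dump(zk)
    zk.stop()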

You should be able to fix it as long as you have the index on disk: just make sure it is
where it is expected and manually update clusterstate.json. It would be good to take a look
at the logs first and see if they tell you anything, though.
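
For the manual clusterstate.json update, a minimal read-edit-write sketch (again assuming
kazoo; the edit itself is left as a comment because it depends on what your backed-up shard
layout actually looks like):

    # Rough sketch: fetch clusterstate.json, fix it by hand, write it back.
    # Assumes the Python "kazoo" client; make sure nothing else is updating
    # the node (e.g. stop the Solr nodes) while you do this.
    import json
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
    zk.start()

    raw, stat = zk.get("/clusterstate.json")
    state = json.loads(raw.decode("utf-8"))

    # ... edit "state" here: put the shard1 entry back and drop the bogus
    # replica under shard2, matching where the index actually sits on disk ...

    zk.set("/clusterstate.json",
           json.dumps(state, indent=2).encode("utf-8"),
           version=stat.version)
    zk.stop()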

I’d also highly recommend you try moving to Solr 4.6.1 when you can though. We have fixed
many, many, many bugs around SolrCloud in the 4 releases since 4.4. You can follow the progress
in the CHANGES file we update for each release.

I wrote a little about 4.6.1 as it relates to SolrCloud here: https://plus.google.com/+MarkMillerMan/posts/CigxUPN4hbA

- Mark

http://about.me/markrmiller

On Jan 31, 2014, at 10:13 AM, David Santamauro <david.santamauro@gmail.com> wrote:

> 
> Hi,
> 
> I have a strange situation. I created a collection with 4 nodes (separate servers,
> numShards=4) and then proceeded to index data ... all has been seemingly well until this
> morning when I had to reboot one of the nodes.
> 
> After reboot, the node I rebooted went into recovery mode! This is completely illogical
> as there is 1 shard per node (no replicas).
> 
> What could have possibly happened to 1) trigger a recovery, and 2) have the node think
> it has a replica to even recover from?
> 
> Looking at the graph on the Solr admin page, it shows that shard1 disappeared and the
> server that was rebooted appears in a recovering state under the server that is home to shard2.
> 
> I then looked at clusterstate.json and it confirms that shard1 is completely missing
> and shard2 now has a replica ... I'm baffled, confused, dismayed.
> 
> Versions:
> Solr 4.4 (4 nodes with tomcat container)
> zookeeper-3.4.5 (5-node ensemble)
> 
> Oh, and I'm assuming shard1 is completely corrupt.
> 
> I'd really appreciate any insight.
> 
> David
> 
> PS I have a copy of all the shards backed up. Is there a way to possibly rsync shard1
> back into place and "fix" clusterstate.json manually?

