lucene-solr-user mailing list archives

From Shalin Shekhar Mangar <shalinman...@gmail.com>
Subject Re: Cluster state ranges are all null after reboot
Date Wed, 26 Feb 2014 12:46:39 GMT
If you have 15 shards and have never used shard splitting, you can
calculate the shard ranges with:

    new CompositeIdRouter().partitionRange(15, new CompositeIdRouter().fullRange())

This gives me:
[80000000-9110ffff, 91110000-a221ffff, a2220000-b332ffff,
b3330000-c443ffff, c4440000-d554ffff, d5550000-e665ffff,
e6660000-f776ffff, f7770000-887ffff, 8880000-1998ffff,
19990000-2aa9ffff, 2aaa0000-3bbaffff, 3bbb0000-4ccbffff,
4ccc0000-5ddcffff, 5ddd0000-6eedffff, 6eee0000-7fffffff]
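
If you don't have the Solr jars handy, the arithmetic behind that list can be
approximated with plain Java. The sketch below is not Solr's code: the class
ShardRanges and its partition() method are hypothetical names. The rule it
uses (divide the 32-bit hash space into equal steps, then move each boundary
down to the previous value ending in ffff) is only inferred from the output
above. It reproduces the 15 ranges listed, but verify against the real router
before writing anything into ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardRanges {

    // Hypothetical helper, not part of Solr: splits the signed 32-bit hash
    // space into `partitions` contiguous ranges and formats each one as
    // hex(min)-hex(max), in the same style as the list above.
    static List<String> partition(int partitions) {
        List<String> ranges = new ArrayList<>();
        if (partitions <= 0) {
            return ranges;
        }

        long min = Integer.MIN_VALUE;   // 0x80000000
        long max = Integer.MAX_VALUE;   // 0x7fffffff
        long step = Math.max(1, (max - min) / partitions);

        // Assumption: only snap boundaries when each range is much larger
        // than 64K hashes, so snapping can never produce an empty range.
        boolean snap = step >= 0x100000L;

        long start = min;        // real start of the current range
        long targetStart = min;  // ideal, unsnapped start (avoids drift)

        for (int i = 0; i < partitions; i++) {
            long targetEnd = targetStart + step;
            long end = targetEnd;
            if (snap && (end & 0xffffL) != 0xffffL) {
                // Move the end down to the previous value whose low 16 bits are ffff.
                end = (end | 0xffffL) - 0x10000L;
            }
            if (i == partitions - 1) {
                end = max;       // last range always ends at the top of the hash space
            }
            ranges.add(Integer.toHexString((int) start) + "-"
                    + Integer.toHexString((int) end));
            start = end + 1;
            targetStart = targetEnd + 1;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (String range : partition(15)) {
            System.out.println(range);
        }
    }
}
```

Integer.toHexString drops leading zeros, which is why two entries look one
digit short: 887ffff is 0887ffff and 8880000 is 08880000.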

Have you done any more investigation into why this happened? Anything
strange in the logs? Are you able to reproduce this in a test
environment?

On Wed, Feb 19, 2014 at 5:16 AM, Greg Pendlebury
<greg.pendlebury@gmail.com> wrote:
> We've got a 15 shard cluster spread across 3 hosts. This morning our puppet
> software rebooted them all and afterwards the 'range' for each shard has
> become null in zookeeper. Is there any way to restore this value short of
> rebuilding a fresh index?
>
> I've read various questions from people with a similar problem, although in
> those cases it is usually a single shard that has become null allowing them
> to infer what the value should be and manually fix it in ZK. In this case I
> have no idea what the ranges should be. This is our test cluster, and
> checking production I can see that the ranges don't appear to be
> predictable based on the shard number.
>
> I'm also not certain why it even occurred. Our test cluster only has a
> single replica per shard, so when a JVM is rebooted the cluster is
> unavailable... would that cause this? Production has 3 replicas so we can
> do rolling reboots.



-- 
Regards,
Shalin Shekhar Mangar.
