lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: How to replace a solr cloud node
Date Sun, 17 Jul 2016 23:03:50 GMT
I recommend against manually editing your znodes;
that's a recipe for disaster unless you know
_exactly_ what you are doing. One mistake and you
risk your collections. And since you're "continuously
indexing", I don't see what good it would do you anyway:
by the time you finished editing the znode and
(presumably) copying the index over, the copy would be out of
sync with the leader and would do a full replication anyway.

Instead, just bring up a fifth Solr node and use the
Collections API ADDREPLICA command to add a new
replica on it corresponding to each replica on the node
you're replacing. Replication and everything else will happen
automatically, with no downtime.

You can specify exactly which node each replica goes on
(the "node" parameter of ADDREPLICA).
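
To make the ADDREPLICA step concrete, here is a minimal Python sketch that
finds every (collection, shard) pair hosted on the node being replaced and
builds one ADDREPLICA URL per pair, pinned to the new node. It assumes you
have already fetched and JSON-decoded a CLUSTERSTATUS response; the host
addresses, node names and the helper names (shards_on_node, addreplica_url)
are made up for illustration. It only builds the URLs; sending them (with
curl, urllib.request, etc.) is left to you.

```python
from urllib.parse import urlencode


def shards_on_node(cluster_status, node_name):
    """Return (collection, shard) for every replica hosted on node_name.

    cluster_status is assumed to be the decoded JSON of a
    Collections API CLUSTERSTATUS response.
    """
    found = []
    for coll_name, coll in cluster_status["cluster"]["collections"].items():
        for shard_name, shard in coll["shards"].items():
            for replica in shard["replicas"].values():
                if replica["node_name"] == node_name:
                    found.append((coll_name, shard_name))
    return found


def addreplica_url(solr_base, collection, shard, target_node):
    """Build an ADDREPLICA URL that places the new replica on target_node."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": target_node,  # node name as listed in live_nodes (hypothetical value below)
        "wt": "json",
    }
    return f"{solr_base}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical, heavily trimmed CLUSTERSTATUS response.
    status = {"cluster": {"collections": {"collection1": {"shards": {
        "shard1": {"replicas": {
            "core_node1": {"node_name": "10.0.0.1:8983_solr", "state": "active"},
            "core_node2": {"node_name": "10.0.0.4:8983_solr", "state": "active"},
        }},
    }}}}}
    for coll, shard in shards_on_node(status, "10.0.0.4:8983_solr"):
        print(addreplica_url("http://10.0.0.1:8983/solr", coll, shard,
                             "10.0.0.5:8983_solr"))
```

With ~100 collections you would probably want to issue these a few at a
time rather than all at once, since each one triggers a full index copy.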

I'd then issue a DELETEREPLICA for each replica on
the bad node to remove them from the cluster state.
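
The cleanup step could look roughly like the Python sketch below. It
assumes the same decoded CLUSTERSTATUS structure as input, and that the
"replica" parameter of DELETEREPLICA takes the replica's name from the
cluster state (core_node2 and the like). The addresses and the helper
names (replicas_on_node, deletereplica_url) are hypothetical, and again
only URLs are built, nothing is sent.

```python
from urllib.parse import urlencode


def replicas_on_node(cluster_status, node_name):
    """Return (collection, shard, replica_name) for replicas on node_name."""
    found = []
    for coll_name, coll in cluster_status["cluster"]["collections"].items():
        for shard_name, shard in coll["shards"].items():
            for replica_name, replica in shard["replicas"].items():
                if replica["node_name"] == node_name:
                    found.append((coll_name, shard_name, replica_name))
    return found


def deletereplica_url(solr_base, collection, shard, replica):
    """Build a DELETEREPLICA URL for one replica of one shard."""
    params = {
        "action": "DELETEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,  # replica name from the cluster state, e.g. core_node2
        "wt": "json",
    }
    return f"{solr_base}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical, heavily trimmed CLUSTERSTATUS response.
    status = {"cluster": {"collections": {"collection1": {"shards": {
        "shard1": {"replicas": {
            "core_node1": {"node_name": "10.0.0.1:8983_solr", "state": "active"},
            "core_node2": {"node_name": "10.0.0.4:8983_solr", "state": "down"},
        }},
    }}}}}
    for coll, shard, replica in replicas_on_node(status, "10.0.0.4:8983_solr"):
        print(deletereplica_url("http://10.0.0.1:8983/solr", coll, shard, replica))
```

Run this only after the replacement replicas are active, otherwise a shard
could be left with a single copy for a while.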

So at the end of the process you may have replicas named like
collection1_shard1_replica1 and collection1_shard1_replica3.
Not having a collection1_shard1_replica2 is of no
consequence.

One caution. While the replica is being added and while
the sync is going on, incoming updates will be written
to the new replica's tlog and replayed as the final step in the
sync. Under very heavy indexing loads (thousands of docs
per second) the sync can take quite a while to complete.
You do _NOT_ need to stop indexing or even throttle it,
but if you can reduce the indexing load your ADDREPLICA
steps will go faster.
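
Since the sync can take a while, one way to know when it is safe to move
on is to poll CLUSTERSTATUS until every replica on the new node reports
state "active". A rough Python sketch follows, assuming you supply
fetch_status, a function (hypothetical, not shown) that returns the decoded
CLUSTERSTATUS JSON. The helper names, poll interval and poll limit are
arbitrary choices for illustration.

```python
import time


def not_yet_active(cluster_status, node_name):
    """Return (collection, shard, replica, state) for replicas on node_name
    whose state is anything other than "active"."""
    pending = []
    for coll_name, coll in cluster_status["cluster"]["collections"].items():
        for shard_name, shard in coll["shards"].items():
            for replica_name, replica in shard["replicas"].items():
                if (replica["node_name"] == node_name
                        and replica.get("state") != "active"):
                    pending.append((coll_name, shard_name, replica_name,
                                    replica.get("state")))
    return pending


def wait_until_active(fetch_status, node_name, poll_seconds=10,
                      max_polls=360, sleep=time.sleep):
    """Poll until no replica on node_name is pending; True on success.

    Note: this also returns True if the node hosts no replicas at all, so
    call it only after the ADDREPLICA requests have been accepted.
    """
    for _ in range(max_polls):
        if not not_yet_active(fetch_status(), node_name):
            return True
        sleep(poll_seconds)
    return False
```

Something like wait_until_active(my_fetch, "10.0.0.5:8983_solr") would then
block until the new node has caught up or the poll limit is reached.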

Best,
Erick

On Sun, Jul 17, 2016 at 1:21 PM, vidit.asthana <vidit.asthana7@gmail.com> wrote:
> I have a 4-machine cluster with ~100 collections. Each collection has
> numShards=2 and replicationFactor=2. The data directory on each node is
> ~120GB. One of my nodes has a hardware issue, so I need to replace
> it. How can I do that without taking the whole cluster down? The IP of the
> new node will be different. Solr version is 5.1.0.
>
> I cannot take any downtime. Continuous indexing and querying is happening. I
> know how to do it by manually editing state.json of all collections, but I
> think it's unsafe to do that while the cluster is up and it might create
> inconsistency.
