zookeeper-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: Split a large ZooKeeper cluster into multiple separate clusters
Date Wed, 07 Sep 2016 22:04:22 GMT
On 9/7/2016 3:19 PM, Eric Young wrote:
> I have a very large ZooKeeper cluster which manages config and replication
> for multiple SolrCloud clusters.  I want to split the monolithic ZooKeeper
> cluster into smaller, more manageable clusters in a live migration (i.e.
> minimal or no downtime).

The zookeeper list isn't really the right place for most of this.  Most
residents of this list will have little or no knowledge of how Solr uses
zookeeper.  I'm on both lists -- and I'm a lot more familiar with Solr
than Zookeeper.

Because Solr normally will not place a large load on zookeeper, I
personally would just use one zookeeper ensemble for both SolrCloud
clusters, each using a different chroot in zookeeper.  I'd use either
three or five ZK servers, depending on how likely I thought it would be
that I would need to survive two servers going down.
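To illustrate the chroot approach (hostnames and chroot paths here are
examples, not anything from your setup), each SolrCloud cluster would
point at the same ensemble but with a different chroot suffix on its
zkHost string:

```shell
# One shared ZK ensemble, two SolrCloud clusters isolated by chroot.
# Hostnames and paths are placeholders -- substitute your own.

# Cluster 1 starts Solr with:
#   -DzkHost=zk1:2181,zk2:2181,zk3:2181/solrcloud1
# Cluster 2 starts Solr with:
#   -DzkHost=zk1:2181,zk2:2181,zk3:2181/solrcloud2
```

The chroot znode has to exist before Solr will start; if I remember
right, the zkcli.sh script that ships with Solr has a makepath command
that can create it.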

That's not what you asked about though, so I will attempt to help you
with what you DID ask about.

> I have collections that can be updated dynamically which are already
> separated logically in different SolrCloud clusters.  I also have some
> static collections (never updated) that have replicas across all the
> SolrCloud clusters though.  All my collections only have a single shard.
>
> ZooKeeper version: 3.4.6
> Solr version: 4.8.1
>
> Example current setup (minimal):
> ZK cluster servers:  z1-1, z1-2, z1-3, z2-1, z2-2, z2-3
> Solr cluster 1 servers: s1-1, s1-2
> Solr cluster 2 servers: s2-1, s2-2
>
> Example collections:
> Dynamic collection 1: c1 (sharded on s1-1, s1-2)
> Dynamic collection 2: c2 (sharded on s2-1, s2-2)
> Static collection 1: c3 (sharded on all 4 Solr servers s1-1, s1-2, s2-1,
> s2-2)

If you have a collection that has replicas on all four Solr servers,
then your four solr servers are *one* SolrCloud cluster, not two.  If
they were separate clusters, it would not be possible to have one
collection with shards/replicas on all four servers.

I really don't know what to do for the zookeeper part of this equation. 
Somebody else on this list will need to answer that.

Downtime is not going to be avoidable.  With careful planning and
execution, you might be able to minimize it.

The first thing you need to do is rearrange the static collection so it
only lives on two of the Solr servers.  To do this, you can use
ADDREPLICA if additional replicas are required, then DELETEREPLICA to
remove it from two of the servers.
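Using the example names from your message (the replica names below are
placeholders -- the real core node names are in your clusterstate, and
CLUSTERSTATUS can show them on newer versions), the calls would look
something like:

```shell
# Remove the static collection c3's replicas from cluster 2's servers.
# core_node3 / core_node4 are guesses -- use the actual replica names
# from clusterstate.json.
curl "http://s1-1:8983/solr/admin/collections?action=DELETEREPLICA&collection=c3&shard=shard1&replica=core_node3"
curl "http://s1-1:8983/solr/admin/collections?action=DELETEREPLICA&collection=c3&shard=shard1&replica=core_node4"
```

I believe both ADDREPLICA and DELETEREPLICA exist in 4.8.1, but verify
that against the reference guide for your version before relying on it.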

At this point, you'll need to shut down all instances of Solr, make
whatever changes are required to split the zookeeper cluster (which I
can't help you with), and update zkHost in Solr so that each pair of
servers only talks to the servers in its cluster.
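As a sketch of the zkHost change (how you pass the property depends on
how your Solr instances are started -- the start.jar invocation below
is just the typical 4.x setup):

```shell
# After the split, cluster 1's Solr servers point only at ensemble 1:
java -DzkHost=z1-1:2181,z1-2:2181,z1-3:2181 -jar start.jar

# ...and cluster 2's Solr servers only at ensemble 2:
java -DzkHost=z2-1:2181,z2-2:2181,z2-3:2181 -jar start.jar
```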

After making sure that both zk ensembles have all the information in
them, you would then start your Solr servers back up.
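One way to sanity-check each ensemble before bringing Solr back up is
zookeeper's own CLI; the znode paths shown are the standard SolrCloud
ones at the chroot (or root, if you aren't using a chroot):

```shell
# Confirm each ensemble actually holds the Solr data it should.
zkCli.sh -server z1-1:2181 ls /collections
zkCli.sh -server z1-1:2181 get /clusterstate.json
zkCli.sh -server z2-1:2181 ls /collections
```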

Then you'll want to manually edit the two clusterstates to remove all
mention of the collections and servers that don't belong in each
cluster, and after making sure each clusterstate is correct, restart all
the Solr servers.

You *might* be able to just use the DELETE action on the Collections API
to delete collections instead of manually editing clusterstate, but I'm
not 100% positive about that.
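If the DELETE action does work for this, it would be something like the
following, run once per collection that no longer belongs in that
cluster (test on a throwaway copy first):

```shell
# On cluster 1, drop the collection that now lives only in cluster 2.
curl "http://s1-1:8983/solr/admin/collections?action=DELETE&name=c2"
```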

Thanks,
Shawn

