Hi Steve,

I don't think there is any official ZooKeeper documentation on best practices for adding or removing replicas prior to 3.5.0, or on how to do a rolling restart. The community is working towards a 3.5.3 release, which will remove the alpha tag, so 3.5.3 might be a good fit if you want a stable release with the dynamic reconfig feature. The timeframe for that release is unclear to me, though.

On Thu, Oct 20, 2016 at 9:26 AM, Steve Newman wrote:

> Thanks for the pointer. This looks like a nice improvement.
>
> If I'm reading http://zookeeper.apache.org/releases.html correctly, this
> feature is only available in alpha release. In the near term, I need a
> procedure I can follow for a stable release. Is there any documentation
> regarding best practices for adding / removing replicas prior to 3.5.0? The
> 3.5.2 documentation you linked to is somewhat alarming regarding prior
> releases:
>
> "Prior to the 3.5.0 release, the membership and all other configuration
> parameters of Zookeeper were static - loaded during boot and immutable at
> runtime. Operators resorted to ''rolling restarts'' - a manually intensive
> and error-prone method of changing the configuration *that has caused data
> loss and inconsistency in production*."
>
> Thanks,
> Steve
>
> On Thu, Oct 20, 2016 at 8:03 AM, Rakesh Radhakrishnan wrote:
>
> > Hi Steve,
> >
> > I'd suggest you to look at ZooKeeper-3.5.2 latest version and use dynamic
> > reconfig feature. This will help to resize(add/remove zk server) your
> > cluster without restarting entire cluster.
> >
> > Please refer the following links to understand more about the dynamic
> > reconfig feature:-
> > https://zookeeper.apache.org/doc/r3.5.2-alpha/zookeeperReconfig.html
> > http://www.slideshare.net/Hadoop_Summit/dynamic-reconfiguration-of-zookeeper
> >
> > Regards,
> > Rakesh
> >
> > On Thu, Oct 20, 2016 at 3:19 AM, Steve Newman wrote:
> >
> >> Apologies for a basic question, but I've been researching and haven't been
> >> able to find the answer online.
> >>
> >> What is the best way to add or remove replicas from a running ZooKeeper
> >> cluster, with minimal downtime? To add a replica, the naive answer would
> >> seem to be:
> >>
> >> 1. Prepare the new replica(s), i.e. install ZooKeeper and set up the
> >> configuration files.
> >> 2. Edit the configuration for all replicas (new and existing) to list the
> >> new replicas.
> >> 3. Restart all replicas. (Simultaneously? Or gradually, one at a time?)
> >>
> >> Is this the best way to do it? Step 3 seems scary in a production cluster.
> >> Also, will the new replicas smoothly pick up the existing data, or is it
> >> better to seed them with a snapshot somehow?
> >>
> >> Similarly, the naive answer for removing a replica would seem to be:
> >>
> >> 1. Halt the ZooKeeper process.
> >> 2. Edit the configuration for all other replicas to remove the replica
> >> that's going away.
> >> 3. Restart all remaining replicas (one at a time?).
> >>
> >> Again, is this the best approach?
> >>
> >> Thanks,
> >> Steve

--
Cheers
Michael.
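
P.S. For the manual procedure you outlined (edit the server list on every replica, then restart), the bookkeeping could be sketched roughly as below. This is only an illustration, not an official procedure: the hostnames are made up, the helper names are hypothetical, and 2888/3888 are just the ports conventionally shown in zoo.cfg examples. It renders the `server.N=host:peerPort:electionPort` lines for an expanded ensemble and works out how many servers can be down at once while a majority is still up, which is why a restart would have to go gradually rather than all at once. It does not make a rolling restart safe; in particular it says nothing about the window in which old and new servers disagree about the membership.

```python
from typing import Dict, List


def quorum_size(ensemble_size: int) -> int:
    """Smallest majority of an ensemble with this many voting servers."""
    return ensemble_size // 2 + 1


def tolerable_down(ensemble_size: int) -> int:
    """How many servers may be down at once while a majority stays up."""
    return ensemble_size - quorum_size(ensemble_size)


def server_lines(servers: Dict[int, str],
                 peer_port: int = 2888,
                 election_port: int = 3888) -> List[str]:
    """Render zoo.cfg 'server.N=host:peerPort:electionPort' lines, sorted by id."""
    return [
        f"server.{sid}={host}:{peer_port}:{election_port}"
        for sid, host in sorted(servers.items())
    ]


# Hypothetical ensemble: three existing replicas plus two new ones.
current = {1: "zk1.example.com", 2: "zk2.example.com", 3: "zk3.example.com"}
expanded = {**current, 4: "zk4.example.com", 5: "zk5.example.com"}

# These lines would go into zoo.cfg on every replica, new and existing.
for line in server_lines(expanded):
    print(line)

print("quorum size:", quorum_size(len(expanded)))
print("servers that can be down at once:", tolerable_down(len(expanded)))
```

With three servers only one can be down at a time without losing the majority, so "restart all replicas simultaneously" would take the ensemble offline.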