zookeeper-user mailing list archives

From Martin Grotzke <martin.grot...@googlemail.com>
Subject Re: Reconfig without quorum
Date Thu, 18 Sep 2014 07:48:22 GMT
On 09/18/2014 08:24 AM, "Jürgen Wagner (DVT)" wrote:
> What works for me with SolrCloud and our search platform based on 
> this: run Zookeeper in virtual machines in one location and only 
> observer nodes in the other.

Ha, great idea! I need to check with our ops people whether we can
provision the zones differently.
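For my own notes, a minimal sketch of what such an observer setup could look like in zoo.cfg (host names are placeholders; assuming 3 voters in zone A and 2 observers in zone B):

```
# zoo.cfg, shared by all nodes: voting ensemble in zone A,
# observers only in zone B (observers don't take part in voting)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181

server.1=zk-a1:2888:3888
server.2=zk-a2:2888:3888
server.3=zk-a3:2888:3888
server.4=zk-b1:2888:3888:observer
server.5=zk-b2:2888:3888:observer
```

If I read the docs correctly, the observer machines additionally need `peerType=observer` in their own config.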

> If your applications follow the same pattern, the updates will mostly
> occur where the ensemble is located, while the other Zookeepers in
> the redundant location will only "listen".

What do you mean by "apps follow the same pattern"?
Our app servers run equally in both zones, if that's what you're
referring to. I'd like to have the Solr nodes distributed equally across
zones as well, sharing the same configuration.

> In the case of a failure of the main site, simply move the virtual 
> machines of the ensemble over to the other location and switch 
> roles.

How would the VMs be moved over to the working location? We're running
on OpenStack (probably at Rackspace), so I'm not sure what's possible there.

Btw, our 2 zones are both active; they're located in the same DC and
share the same network switch, but sit in different fire compartments.
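Just to spell out the quorum arithmetic for our symmetric setup (a small Python sketch; the zone sizes are illustrative):

```python
# Majority quorum: an ensemble of n voting members needs floor(n/2) + 1
# of them alive. Observers do not vote, so they don't enter the math.
def quorum(total_voters):
    return total_voters // 2 + 1

def survives(remaining_voters, total_voters):
    # True if the voters left after a zone failure still form a quorum
    return remaining_voters >= quorum(total_voters)

# Symmetric 2+2 split: either zone failing leaves 2 of 4 voters
print(survives(2, 4))  # False (quorum of 4 is 3)
# Symmetric 3+3 split: leaves 3 of 6 voters, quorum is 4
print(survives(3, 6))  # False
# All 3 voters in one zone, only observers in the other:
# losing the observer zone leaves all 3 of 3 voters
print(survives(3, 3))  # True (quorum of 3 is 2)
```

So with any symmetric split of voters, losing one zone always loses the quorum, which is exactly why the voters-in-one-zone + observers layout is attractive.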

> This would probably also mirror what you do with the applications. 
> With SolrCloud, you would partition your nodes into two nodesets,
> one for each data center. Make sure that one set of replica for each 
> shard is located in one data center, the other in the other.

Does this make sense with our setup as well?
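If so, I guess the nodeset partitioning would look roughly like this via the Collections API (just a sketch; the collection and node names are made up, and I'm assuming createNodeSet and ADDREPLICA behave as documented):

```shell
# Create a collection whose initial replicas are restricted to the
# nodes of zone 1 (createNodeSet limits initial replica placement)
curl "http://solr-z1-1:8983/solr/admin/collections?action=CREATE\
&name=products&numShards=2&replicationFactor=1\
&createNodeSet=solr-z1-1:8983_solr,solr-z1-2:8983_solr"

# Then add a second replica per shard on the nodes of zone 2
curl "http://solr-z1-1:8983/solr/admin/collections?action=ADDREPLICA\
&collection=products&shard=shard1&node=solr-z2-1:8983_solr"
curl "http://solr-z1-1:8983/solr/admin/collections?action=ADDREPLICA\
&collection=products&shard=shard2&node=solr-z2-2:8983_solr"
```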

> If redundant and reliable feeding really is important (e.g., because 
> the data is not persistent and cannot be reproduced), you will 
> probably be better off with two independent SolrCloud instances, one 
> per data center, and some reliable message delivery in front for 
> feeding.

Search/query availability is important for us. The data can be
reproduced; we could rebuild a Solr index in a fairly short time
(~10-30 min).

> Finally, consider also management procedures in handling a 
> virtualized environment with the SolrCloud and Zookeeper nodes. If 
> you employ a cloud management platform handling service migration 
> between sites, this may be an even easier solution. Dead Zookeepers 
> and SolrCloud nodes would automagically pop up, resurrected in the 
> surviving location in case one data center should fail.

Ok, I'll probably discuss this kind of stuff further with our ops team ;-)

Do you have some more concrete advice that we could discuss?

> A really final note: as we face this issue in customer scenarios as 
> well, we are also looking into not using Zookeeper for this purpose, 
> but rather Cassandra instances. This leads to a somewhat different 
> interaction model between Solr instances, but may be better suited 
> esp. for the partitioning case. Bad news: yes, we're on our own with 
> this. No standard support from Solr for Cassandra yet.

Yeah, we're using C* as well; it would really be great to profit from
its availability strengths and put a coordination service on top.

> So, there are several approaches how this could be handled. Which
> one is the best for you is left to decide on the precise topology 
> requirements and platform capabilities. Unfortunately (or luckily
> for consulting companies in the field :-), there is not a single,
> easy approach that works for all.

Thanks for your valuable input,

> Best regards, --Jürgen
> On Wed, Sep 17, 2014 at 1:19 PM, Martin Grotzke < 
> martin.grotzke@googlemail.com> wrote:
>>> Hi,
>>> is it true, that the reconfig command that's available since 
>>> 3.5.0 can only be used if there's a quorum?
>>> Our situation is that we have 2 datacenters (actually only 2 
>>> zones within the same DC) which will be provisioned equally, so 
>>> that we'll have an even number of ZK nodes (true, not optimal). 
>>> When 1 zone fails, there won't be a quorum any more and ZK will 
>>> be unavailable - that's my understanding. Is it possible to add 
>>> new nodes to the ZK cluster and achieve a quorum again while the 
>>> failed zone is still unavailable?
>>> What would you recommend how to handle this situation?
>>> We're using (going to use) SolrCloud as clients.
>>> Thanks && cheers, Martin
> --
> Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С
> уважением *i.A. Jürgen Wagner* Head of Competence Center
> "Intelligence" & Senior Cloud Consultant
> Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany Phone: +49 
> 6151 868-8725, Fax: +49 711 13353-53, Mobile: +49 171 864 1543 
> E-Mail: juergen.wagner@devoteam.com, URL: www.devoteam.de
> ------------------------------------------------------------------------
> Managing Board: Jürgen Hatzipantelis (CEO)
> Address of Record: 64331 Weiterstadt, Germany; Commercial Register: 
> Amtsgericht Darmstadt HRB 6450; Tax Number: DE 172 993 071

inoio gmbh - http://inoio.de
Schulterblatt 36, 20357 Hamburg
Amtsgericht Hamburg, HRB 123031
Geschäftsführer: Dennis Brakhane, Martin Grotzke, Ole Langbehn
