lucene-solr-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: Cross DC SolrCloud anti-patterns in presentation shalinmangar/cross-datacenter-replication-in-apache-solr-6
Date Sat, 24 Jun 2017 15:14:28 GMT
On 6/24/2017 2:14 AM, Arcadius Ahouansou wrote:
> Interpretation 1:
>
> - On slide 6 and 7: Only 2 DC used, so the ZK quorum will not survive and recover after
> 1 DC failure
>
> - On slide 8: We have 3 DCs which OK for ZK.
> But we have 6 ZK nodes.
> This is a problem because ZK likes 3, 5, 7 ... odd nodes.

On both slide 6 and slide 7, Solr stays completely operational in DC1 if
DC2 goes down.  It all falls apart if DC1 goes down.  For clients that
can still reach them, the remaining Solr servers are read only in that
situation.

Slide 8 is very similar -- if DC1 goes down, Solr is read only.  If
either DC2 or DC3 goes down, everything is fine for clients that can
still get to Solr.  One additional consideration: If both DC2 and DC3 go
down, then the remaining Solr servers in DC1 are read only.

ZooKeeper doesn't *need* an odd number of servers, but there's no
benefit to an even number.  If you have 5 servers, two can go down.  If
you have 6 servers, you can still only lose two, so you might as well
just run 5.  You'd have fewer possible points of failure, less power
usage, and less bandwidth usage.
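
The arithmetic can be sketched in a few lines of Python. This is only an
illustration of majority quorum; the function names are made up for this
example and are not part of ZooKeeper or Solr.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Print ensemble size, quorum size, and tolerated failures side by side.
    for n in range(3, 8):
        print(n, quorum_size(n), tolerated_failures(n))
```

Going from 5 to 6 servers raises the quorum from 3 to 4, so the number of
servers you can lose stays at two.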

The smallest layout I would recommend is an odd number of data centers,
at least three, with one ZooKeeper server in each location.  For Solr,
you want at least two servers, split evenly between at least two of
those datacenter locations.

If you're really stuck with only two datacenters, then you can follow
the advice in the presentation: Set up a full cloud in each datacenter
and use CDCR between them.

> Interpretation 2:
>
> Any SolrCloud deployment with "Remote SolrCloud nodes" i.e. solrCloud not in same DC
> as ZK is deemed an anti-pattern (note that DCs can be just a couple of miles apart and could
> be connected by high speed network)

I'm not sure that this is actually true, but it does introduce latency
and more moving parts in the form of network connections between data
centers -- connections which might go down.  I wouldn't do it, but I
also wouldn't automatically dismiss it as a viable setup, as long as it
meets ZooKeeper's requirements and there are two complete copies of the
Solr collections, each in different data centers.

Typical designs survive the loss of only one datacenter, but if you
were to use five datacenters and have enough Solr servers for three
complete copies of your collections, you could survive two data center
outages.
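
A rough way to check a layout is to try every combination of datacenter
outages and see whether a ZooKeeper majority is left.  The sketch below
does that in Python.  The layouts and the function name are hypothetical,
chosen for illustration rather than taken from the slides, and it only
looks at ZooKeeper quorum; it says nothing about whether a complete copy
of each Solr collection is still reachable.

```python
from itertools import combinations


def survives(zk_per_dc: dict, dcs_lost: int) -> bool:
    """True if a ZooKeeper majority remains after ANY dcs_lost datacenters fail."""
    total = sum(zk_per_dc.values())
    quorum = total // 2 + 1
    for lost in combinations(zk_per_dc, dcs_lost):
        remaining = total - sum(zk_per_dc[dc] for dc in lost)
        if remaining < quorum:
            return False
    return True


# Hypothetical layouts: datacenter name -> number of ZooKeeper servers there.
five_dcs = {f"dc{i}": 1 for i in range(1, 6)}
two_dcs = {"dc1": 2, "dc2": 1}

if __name__ == "__main__":
    print(survives(five_dcs, 2))  # any two of five datacenters down
    print(survives(two_dcs, 1))   # either of two datacenters down
```

With two datacenters, one of them always holds the majority, so losing
that one loses quorum no matter how the servers are divided.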

Thanks,
Shawn

