lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "SolrCloud2" by Mark Miller
Date Sat, 14 Jan 2012 18:16:00 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SolrCloud2" page has been changed by Mark Miller:
http://wiki.apache.org/solr/SolrCloud2?action=diff&rev1=3&rev2=4

  Solr embeds and uses Zookeeper as a repository for cluster configuration and coordination;
think of it as a distributed filesystem that contains information about all of the Solr
servers.
  
  === Example A: Simple two shard cluster ===
+ {{http://people.apache.org/~markrmiller/2shard2server.jpg}}
+ 
  This example creates a cluster consisting of two Solr servers representing two different
shards of a collection.
  
  Since we'll need two Solr servers for this example, simply make a copy of the example directory
for the second server.
@@ -58, +60 @@

  If at any point you wish to start over fresh or experiment with different configurations,
you can delete all of the cloud state contained within zookeeper by simply deleting the solr/zoo_data
directory after shutting down the servers.
  
  === Example B: Simple two shard cluster with shard replicas ===
+ {{http://people.apache.org/~markrmiller/2shard4server.jpg}}
+ 
  This example will simply build off of the previous example by creating another copy of shard1
and shard2.  Extra shard copies can be used for high availability and fault tolerance, or
simply for increasing the query capacity of the cluster.
  
  
@@ -90, +94 @@

  To demonstrate failover for high availability, go ahead and kill any one of the Solr servers
(just press CTRL-C in the window running the server) and send another query request to
any of the remaining servers that are up.
  
  === Example C: Two shard cluster with shard replicas and zookeeper ensemble ===
+ {{http://people.apache.org/~markrmiller/2shard4server2.jpg}}
+ 
  The problem with Example B is that while there are enough Solr servers to survive any one
of them crashing, there is only one zookeeper server that contains the state of the cluster.
If that zookeeper server crashes, distributed queries will still work since the Solr servers
remember the state of the cluster last reported by zookeeper.  However, no new
servers or clients will be able to discover the cluster state, and no changes to the cluster
state will be possible.
  
  Running multiple zookeeper servers in concert (a zookeeper ensemble) allows for high availability
of the zookeeper service.  Every zookeeper server needs to know about every other zookeeper
server in the ensemble, and a majority of servers are needed to provide service.  For example,
a zookeeper ensemble of 3 servers allows any one to fail with the remaining 2 constituting
a majority to continue providing service.  5 zookeeper servers are needed to allow for the
failure of up to 2 servers at a time.
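
  The majority rule above can be checked with a little arithmetic. The following is a minimal
sketch in Python, not part of Solr or Zookeeper; the function names quorum_size and
tolerated_failures are hypothetical and chosen only for illustration. It assumes that "majority"
means strictly more than half of the ensemble.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that is strictly more than half the ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Hypothetical ensemble sizes, printed as: size, majority, tolerated failures.
    for n in (1, 2, 3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

  Under that assumption, 3 servers tolerate 1 failure and 5 servers tolerate 2, matching the
figures above. An ensemble of 4 still tolerates only 1 failure, so by this arithmetic an even-sized
ensemble adds no fault tolerance over the next smaller odd size.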
