From: Apache Wiki
To: Apache Wiki
Reply-To: solr-dev@lucene.apache.org
Date: Tue, 02 Feb 2010 16:38:48 -0000
Message-ID: <20100202163848.5780.9517@eos.apache.org>
Subject: [Solr Wiki] Trivial Update of "SolrCloud" by YonikSeeley

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SolrCloud" page has been changed by YonikSeeley.
The comment on this change is: start embedded ensemble example.
http://wiki.apache.org/solr/SolrCloud?action=diff&rev1=22&rev2=23

--------------------------------------------------

  If you haven't yet, go through the simple [[http://lucene.apache.org/solr/tutorial.html|Solr Tutorial]] to familiarize yourself with Solr.

- Solr embeds and uses Zookeeper as a repository for cluster configuration and coordination - think of it as a distributed filesystem.
+ Solr embeds and uses Zookeeper as a repository for cluster configuration and coordination - think of it as a distributed filesystem that contains information about all of the Solr servers.

  Since we'll need two solr servers for this example, simply make a copy of the example directory for the second server.

  {{{
  cp -r example example2
  }}}

- === Simple two shard cluster ===
+ === Example A: Simple two shard cluster ===
  This example simply creates a cluster consisting of two solr servers representing two different shards of a collection.

  Since we'll need two solr servers for this example, simply make a copy of the example directory for the second server.

@@ -75, +75 @@

  If at any point you wish to start over fresh or experiment with different configurations, you can delete all of the cloud state contained within zookeeper by simply deleting the solr/zoo_data directory after shutting down the servers.
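For the two server layout above, that reset might look like the following sketch (assuming both Solr instances were started from the example and example2 directories created earlier and have already been shut down):

{{{
# Sketch: remove the cloud state kept by the embedded zookeeper servers
# so that the next startup begins with a clean slate.
rm -r example/solr/zoo_data example2/solr/zoo_data
}}}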
- === Simple two shard cluster with shard replicas ===
+ === Example B: Simple two shard cluster with shard replicas ===
  This example will simply build off of the previous example by creating another copy of shard1 and shard2. Extra shard copies can be used for high availability and fault tolerance, or simply for increasing the query capacity of the cluster.

  First, run through the previous example so we already have two shards and some documents indexed into each. Then simply make a copy of those two servers:

@@ -100, +100 @@

  http://localhost:7500/solr/collection1/select?distrib=true&q=*:*

- Send this query multiple times and observe the logs from the solr servers. From your web browser, you may need to hold down CTRL while clicking on the browser refresh button to bypass the HTTP caching in your browser. You should be able to observe Solr load balancing the requests across shard replicas, using different servers to satisfy each request. There will be a log statement for the top-level request in the server the browser sends the request to, and then a log statement for each sub-request that is merged to produce the complete response.
+ Send this query multiple times and observe the logs from the solr servers. From your web browser, you may need to hold down CTRL while clicking on the browser refresh button to bypass the HTTP caching in your browser. You should be able to observe Solr load balancing the requests across shard replicas, using different servers to satisfy each request. There will be a log statement for the top-level request in the server the browser sends the request to, and then a log statement for each sub-request that is merged to produce the complete response.

+ To demonstrate failover for high availability, go ahead and kill any one of the Solr servers (just press CTRL-C in the window running the server) and send another query request to any of the remaining servers that are up.

+ === Two shard cluster with shard replicas and zookeeper ensemble ===
+ The problem with example B is that while there are enough Solr servers to survive any one of them crashing, there is only one zookeeper server that contains the state of the cluster. If that zookeeper server crashes, distributed queries will still work, since the solr servers remember the state of the cluster last reported by zookeeper. The problem is that no new servers or clients will be able to discover the cluster state, and no changes to the cluster state will be possible.

+ Running multiple zookeeper servers in concert (a zookeeper ensemble) allows for high availability of the zookeeper service. Every zookeeper server needs to know about every other zookeeper server in the ensemble, and a majority of servers are needed to provide service. For example, a zookeeper ensemble of 3 servers allows any one of them to fail while the remaining 2 constitute a majority and continue providing service. 5 zookeeper servers are needed to allow for the failure of up to 2 servers at a time.

+ For production, it's recommended that you run an external zookeeper ensemble rather than having Solr run embedded zookeeper servers. For this example, we'll use the embedded servers for simplicity.

+ First, stop all 4 servers and then clean up the zookeeper data directories for a fresh start.
+ {{{
+ rm -r example*/solr/zoo_data
+ }}}

+ We will be running the servers again at ports 8983, 7574, 8900, and 7500. The default is to run an embedded zookeeper server at hostPort+1000, so if we run an embedded zookeeper on the first three servers, the ensemble address will be {{{localhost:9983,localhost:8574,localhost:9900}}}.
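Once the four servers described below are up, one quick way to check that each embedded zookeeper in the ensemble is answering on its expected client port is ZooKeeper's four-letter "ruok" command. This is only a sketch; it assumes netcat (nc) is installed, and each healthy server should reply "imok":

{{{
# Sketch: ask each embedded zookeeper (hostPort + 1000 on the first three
# servers) whether it is healthy; a running server replies "imok".
echo ruok | nc localhost 9983
echo ruok | nc localhost 8574
echo ruok | nc localhost 9900
}}}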
+ As a convenience, we'll have the first server upload the solr config to the cluster. You will notice it block until you have actually started the second server. This is due to zookeeper needing a quorum before it can operate.

+ NOTE: this doesn't work yet, because the client of the second server checks for the collection config before the first has finished uploading it, and the first server needs to wait until the second server starts (to establish a quorum) before it can start uploading.
+ {{{
+ cd example
+ java -Dbootstrap_confname=myconf -Dbootstrap_confdir=./solr/conf -DzkRun -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
+ }}}

+ {{{
+ cd example2
+ java -Djetty.port=7574 -DhostPort=7574 -DzkRun -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
+ }}}

+ {{{
+ cd exampleB
+ java -Djetty.port=8900 -DhostPort=8900 -DzkRun -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
+ }}}

+ {{{
+ cd example2B
+ java -Djetty.port=7500 -DhostPort=7500 -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
+ }}}

  == ZooKeeper ==
- Multiple Zookeeper servers running together for fault tolerance and high availability are called an ensemble. For production, it's recommended that you run an external zookeeper ensemble rather than having Solr run embedded servers.
+ Multiple Zookeeper servers running together for fault tolerance and high availability are called an ensemble. For production, it's recommended that you run an external zookeeper ensemble rather than having Solr run embedded servers. See the [[http://hadoop.apache.org/zookeeper/|Apache ZooKeeper]] site for more information on downloading and running a zookeeper ensemble.

  When Solr runs an embedded zookeeper server, it defaults to using the solr port plus 1000 for the zookeeper client port. In addition, it defaults to adding one to the client port for the zookeeper server port, and two for the zookeeper leader election port. So in the first example with Solr running at 8983, the embedded zookeeper server used port 9983 for the client port and 9984,9985 for the server ports.
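To make that port convention concrete, here is a small shell sketch (the variable names are only illustrative) that derives the three zookeeper ports from a Solr port:

{{{
# Illustration of the embedded zookeeper port convention described above,
# using Solr's default port 8983 as the starting point.
SOLR_PORT=8983
ZK_CLIENT_PORT=$((SOLR_PORT + 1000))      # 9983 - zookeeper client port
ZK_SERVER_PORT=$((ZK_CLIENT_PORT + 1))    # 9984 - zookeeper server port
ZK_ELECTION_PORT=$((ZK_CLIENT_PORT + 2))  # 9985 - leader election port
echo "client=$ZK_CLIENT_PORT server=$ZK_SERVER_PORT election=$ZK_ELECTION_PORT"
}}}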