lucene-solr-user mailing list archives

From "Pouliot, Scott"
Subject RE: SOLRCloud on 6.4 on Ubuntu
Date Fri, 24 Feb 2017 17:15:04 GMT
I actually started out using 3.4.9, but a recent tutorial recommended using 3.3.x
instead since 3.4 "wasn't ready for production".  I'm fine with either really!  I did read
that in production zk should use an odd number of servers for sure (1, 3, 5, etc.) for
redundancy purposes, so a chunk of your cloud doesn't go dead along with your zk server.  3 servers
provide better coverage since if one dies, the other two (66%) are still up.
I'm doing this setup in Azure more as a proof of concept and to figure out how in the world
to get SOLR Cloud up and running reliably so we can talk about migrating over.

I've definitely read over the 2 links you shared, and while I understand them, the lightbulb
still hasn't lit up yet in my head for that "ah ha!" moment.  ;-)

I plan to try and spin up some new VMs this weekend and start the process over again.  It's
gotta work one of these times!

Thanks for the info!

-----Original Message-----
From: Shawn Heisey
Sent: Friday, February 24, 2017 11:34 AM
Subject: Re: SOLRCloud on 6.4 on Ubuntu

On 2/23/2017 2:12 PM, Pouliot, Scott wrote:
> I'm trying to find a good beginner level guide to setting up SolrCloud NOT using the
> example configs that are provided with SOLR.
> Here are my goals (and the steps I have done so far!):
> 1.       Use an external Zookeeper server
> a.       wget

Solr includes the 3.4.6 version of the Zookeeper client.  I would strongly recommend that
the servers be running the latest 3.4.x version, currently 3.4.9.  Although I cannot say for
sure, it's entirely possible that Solr uses ZK client features that are not supported by an
earlier server version.

I've omitted the rest of the zookeeper steps you mentioned.  They look fine, as long as the
configuration is OK and the version is new enough. 
Another bit of info:  You do know that Zookeeper requires three separate physical servers
for a redundant install, I hope?  One or two servers is not enough.
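
The reason one or two servers is not enough comes down to majority quorum: a ZooKeeper ensemble only stays available while more than half of its servers are up. A minimal sketch of that arithmetic follows; the function name tolerated_failures is made up for illustration and is not part of any ZooKeeper tooling.

```shell
#!/usr/bin/env bash
# Majority quorum: an ensemble of n servers needs floor(n/2) + 1 of them up,
# so it tolerates (n - 1) / 2 failures (integer division).
# tolerated_failures is a hypothetical helper name, used only in this sketch.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 4 5; do
  echo "$n servers: quorum $(( n / 2 + 1 )), survives $(tolerated_failures "$n") failure(s)"
done
```

Two servers survive no more failures than one, and four survive no more than three, which is why odd ensemble sizes are the usual recommendation.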

> 2.       Install SOLR on both nodes
> a.       wget
> b.       tar xzf solr-6.4.1.tgz solr-6.4.1/bin/ --strip-components=2
> c.       ./ solr-6.4.1.tgz
> d.       Update to include the ZKHome variable set to my ZK server's ip on
> port 2181
> Now it seems if I start SOLR manually with bin/solr start -c -p 8080 -z <ZK IP>:2181
> then it will actually load, but if I let it auto start, I get an HTTP 500 error on the Admin
> UI for SOLR.

Again ... you need three ZK servers for redundancy, so the setting for -z needs to reference
all three, and probably should have a chroot.  You can set all of those startup parameters
by configuring variables in /etc/default/ of starting it manually.  The copy
of that's in the bin directory is NOT used when running as a service.
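
As a rough sketch of what a -z value referencing all three servers plus a chroot could look like: the hostnames zk1/zk2/zk3.example.com, the /solr chroot, and the helper build_zk_host are all placeholders invented for this example. Port 2181 and the start flags are the ones from the original message. The script only prints the command, it does not start anything.

```shell
#!/usr/bin/env bash
# build_zk_host is a hypothetical helper: it joins host:port pairs with commas
# and appends the chroot once, at the very end of the string.
build_zk_host() {
  local chroot="$1"; shift
  local port=2181 joined="" host
  for host in "$@"; do
    joined="${joined:+$joined,}${host}:${port}"
  done
  printf '%s%s\n' "$joined" "$chroot"
}

# Placeholder hostnames and chroot; substitute your own.
ZK_HOST="$(build_zk_host /solr zk1.example.com zk2.example.com zk3.example.com)"

# Dry run: show the start command from the thread with the full ensemble string.
echo "bin/solr start -c -p 8080 -z \"$ZK_HOST\""
```

Note that the chroot appears only once, after the last host, not after each host:port pair.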

> I also can't seem to figure out what I need to upload into Zookeeper as far as configuration
> files go.  I created a test collection on the instance when I got it up one time...but it
> has yet to start properly again for me.

Use the upconfig command with zkcli or the zk command on the solr script.  The directory you
are uploading should contain everything in a core config that's normally in the "conf" directory
-- solrconfig.xml, the schema, and any files referenced by either of those.
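
For illustration, here is a small dry-run sketch of the upconfig step using the zk command on the solr script. The -z, -n and -d flags are how I understand the 6.x bin/solr script, so check the script's own help output before relying on them. The config name myconf, the ZooKeeper address, and the helper upconfig_command are assumptions made up for this example.

```shell
#!/usr/bin/env bash
# upconfig_command is a hypothetical helper: it checks that the directory looks
# like a core "conf" directory (contains solrconfig.xml) and then prints the
# upload command instead of running it.
upconfig_command() {
  local conf_dir="$1" conf_name="$2" zk_host="$3"
  if [ ! -f "$conf_dir/solrconfig.xml" ]; then
    echo "no solrconfig.xml in $conf_dir" >&2
    return 1
  fi
  echo "bin/solr zk upconfig -z $zk_host -n $conf_name -d $conf_dir"
}

# Demo against a throwaway directory; a real conf directory would also hold
# the schema and any files referenced by solrconfig.xml or the schema.
demo_dir="$(mktemp -d)"
touch "$demo_dir/solrconfig.xml"
upconfig_command "$demo_dir" myconf "zk1.example.com:2181/solr"
rm -rf "$demo_dir"
```

The name passed with -n is what a collection would later refer to when it is created against the uploaded configuration.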

> Are there any GOOD tutorials out there?  I have read most of the 
> documentation I can get my hands on thus far from Apache, and blogs 
> and such, but the light bulb still has not lit up for me yet and I 
> feel like a n00b  ;-)

There's a quick start.  This URL shows how to start a SolrCloud example where Zookeeper is
embedded within one of the Solr nodes, and everything's on one machine.  This setup is not
suitable for production.

This is some more detailed info about migrating to production:

Information about setting up a redundant external Zookeeper is best obtained from the Zookeeper
project.  They understand their software best.

> My company is currently running SOLR in the old master/slave config and I'm trying to
> set up a SOLRCloud so that we can toy with it in a Dev/QA Environment and see what it's capable
> of.  We're currently running 4 separate master/slave SOLR server pairs in production to spread
> out the load a bit, but I'd rather see us migrate towards a cluster/cloud scenario to gain
> some computing power here!

What SolrCloud offers is much easier management and a true cluster with no masters and no
slaves.  Depending on how the master-slave architecture is used, SolrCloud can actually be
a step down in performance, but it is generally easier to get a redundant and sharded collection
operational.  The possible performance disadvantage is not usually extreme, and exists because
all replicas handle their own indexing, rather than having slaves that copy the completed
index from the master.

