lucene-solr-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: solr cloud does not start with many collections
Date Wed, 04 Mar 2015 02:18:32 GMT
On 3/2/2015 12:54 AM, Damien Kamerman wrote:
> I still see the same cloud startup issue with Solr 5.0.0. I created 4,000
> collections from scratch and then attempted to stop/start the cloud.

I have been trying to duplicate your setup using the "-e cloud" example
included in the Solr 5.0 download and accepting all the defaults.  This
sets up two Solr instances on one machine, one of which runs an embedded
zookeeper.

I have been running into a LOT of issues just trying to get so many
collections created, to say nothing about restart problems.

The first problem I ran into was heap size.  The example starts each of
the Solr instances with a 512MB heap, which is WAY too small.  It
allowed me to create 274 collections, in addition to the gettingstarted
collection that the example started with.  One of the Solr instances
simply crashed.  No OutOfMemoryError or anything else in the log ...
it just died.

I bumped the heap on each Solr instance to 4GB.  The next problem I ran
into was the operating system limit on the number of processes ... and I
had already bumped that up beyond the usual 1024 default, to 4096.  Solr
was not able to create any more threads, because my user was not able to
fork any more processes.  I got over 700 collections created before that
became a problem.  My max open files had also been increased already --
this is another place where a stock system will run into trouble
creating a lot of collections.
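
A quick way to see whether a system is still at stock limits before
creating a lot of collections is to compare the two per-user limits
mentioned above against a target.  This is a minimal sketch, assuming
bash; the check_limit helper is hypothetical, 32768 for processes is
the value used later in this message, and using the same number for
open files is only an assumption.

```shell
#!/bin/bash
# Report the per-user limits that can cap collection creation.
# The targets passed at the bottom are illustrative, not Solr requirements.

check_limit() {
  # $1 = label, $2 = current value (a number or "unlimited"), $3 = wanted minimum
  local label=$1 current=$2 wanted=$3
  case $current in
    unlimited)
      echo "OK   $label: unlimited" ;;
    ''|*[!0-9]*)
      echo "??   $label: could not read the limit" ;;
    *)
      if [ "$current" -ge "$wanted" ]; then
        echo "OK   $label: $current"
      else
        echo "LOW  $label: $current (want at least $wanted)"
      fi ;;
  esac
}

# In bash, ulimit -u is max user processes and ulimit -n is max open files.
check_limit "max user processes" "$(ulimit -u)" 32768
check_limit "max open files" "$(ulimit -n)" 32768
```

Run it as the same user that starts Solr, since these limits are
per-user and per-session.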

I raised that limit, and the next problem I ran into was total RAM on
the machine ... it turns out that with two Solr processes each using a
4GB heap, the machine was 3GB deep into swap.  This is odd, because I
have 12GB of RAM on that machine and it's not doing very much besides
this SolrCloud test.  With that much swapping, performance was
completely unacceptable, and the run would probably never have finished.
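
Swap usage like this can be checked before and during a long run.
The sketch below assumes a Linux system with /proc/meminfo (the
message does not name the operating system); swap_used_kb is a
hypothetical helper, not part of Solr.

```shell
#!/bin/bash
# Print how much swap is in use, in kB, from a meminfo-format file.

swap_used_kb() {
  # $1 = path to the file to read (defaults to /proc/meminfo)
  awk '/^SwapTotal:/ {total = $2}
       /^SwapFree:/  {free = $2}
       END           {print total - free}' "${1:-/proc/meminfo}"
}

# Only report when the Linux file is actually present.
if [ -r /proc/meminfo ]; then
  echo "swap in use: $(swap_used_kb) kB"
fi
```

A value that keeps climbing while collections are being created is a
sign that the combined heaps do not fit in physical memory.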

So ... I had to find a machine with more memory.  I've got a dev server
with 32GB.  I fired up the two SolrCloud processes on it with 5GB heap
each, with 32768 processes allowed.  I am in the process of building
4000 collections (numShards=2, replicationFactor=1), and so far, it is
working OK.  I have almost 2700 collections now.
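
For a sense of scale, the number of cores each instance ends up
hosting follows from those parameters.  This is a small arithmetic
sketch, assuming the cores are spread evenly across the two instances;
cores_per_node is a hypothetical helper.

```shell
#!/bin/bash
# Estimate how many cores each Solr instance hosts.

cores_per_node() {
  # $1 = collections, $2 = numShards, $3 = replicationFactor, $4 = instances
  local total=$(( $1 * $2 * $3 ))
  # Round up so an uneven split reports the busier instance.
  echo $(( (total + $4 - 1) / $4 ))
}

# 4000 collections, numShards=2, replicationFactor=1, two instances.
cores_per_node 4000 2 1 2
```

With the numbers from this test that is 8000 cores in total, or 4000
per instance.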

If I can ever get it to actually build 4000 collections, then I can
attempt restarting the second Solr instance and see what happens.  I
think I might hit another roadblock in the form of the 10000 maxThreads
limit on Jetty.  Running this all on one machine might not be possible,
but I'm giving it a try.

Here's the script I am using to create all those collections:

#!/bin/sh

# Create mycoll0000 through mycoll3999 with the Collections API.
for i in $(seq -f "%04.0f" 0 3999)
do
  echo "$i"
  coll="mycoll${i}"
  URL="http://localhost:8983/solr/admin/collections"
  URL="${URL}?action=CREATE&name=${coll}&numShards=2&replicationFactor=1"
  # Reuse the gettingstarted config set from the example.
  URL="${URL}&collection.configName=gettingstarted"
  curl "$URL"
done
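
Because creation can stall partway through, as it did at 274 and again
at around 700 collections, a variant of the loop that stops at the
first failed request makes it easier to see where to resume.  This is
a sketch only, assuming bash and assuming Solr answers a failed CREATE
with an HTTP error status; SOLR_URL, DRY_RUN, create_url and
create_range are hypothetical names introduced here.

```shell
#!/bin/bash
# Create a range of collections, stopping at the first failure.
# DRY_RUN=1 (the default here) only prints the URLs that would be called.
SOLR_URL=${SOLR_URL:-http://localhost:8983/solr}
DRY_RUN=${DRY_RUN:-1}

create_url() {
  # $1 = collection name
  local params="action=CREATE&name=$1&numShards=2&replicationFactor=1"
  params="${params}&collection.configName=gettingstarted"
  printf '%s/admin/collections?%s\n' "$SOLR_URL" "$params"
}

create_range() {
  # $1 = first index, $2 = last index (inclusive)
  local i coll
  for i in $(seq "$1" "$2"); do
    coll=$(printf 'mycoll%04d' "$i")
    if [ "$DRY_RUN" = 1 ]; then
      create_url "$coll"
    elif ! curl -sS -f "$(create_url "$coll")" > /dev/null; then
      # curl -f exits non-zero when the server returns an HTTP error.
      echo "CREATE failed at $coll" >&2
      return 1
    fi
  done
}

create_range 0 2
```

After a failure, calling create_range again with the failed index as
the first argument picks up where the run stopped.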

Thanks,
Shawn
